Abstract

Let $\mathfrak{C}$ denote the Clifford algebra over $\mathbb{R}^n$, which is the von Neumann algebra generated by $n$ self-adjoint
operators $Q_j$, $j = 1, \dots, n$, satisfying the canonical anticommutation relations, $Q_iQ_j + Q_jQ_i = 2\delta_{ij}I$, and let
$\tau$ denote the normalized trace on $\mathfrak{C}$. This algebra arises in quantum mechanics as the algebra of observables
generated by $n$ Fermionic degrees of freedom. Let $\mathfrak{P}$ denote the set of all positive operators $\rho \in \mathfrak{C}$ such that
$\tau(\rho) = 1$; these are the non-commutative analogs of probability densities in the non-commutative probability
space $(\mathfrak{C}, \tau)$. The Fermionic Fokker-Planck equation is a quantum-mechanical analog of the classical
Fokker-Planck equation with which it has much in common, such as the same optimal hypercontractivity properties.
In this paper we construct a Riemannian metric on $\mathfrak{P}$ that we show to be a natural analog of the classical
2-Wasserstein metric, and we show that, in analogy with the classical case, the Fermionic Fokker-Planck equation
is gradient flow in this metric for the relative entropy with respect to the ground state. We derive a number
of consequences of this, such as a sharp Talagrand inequality for this metric, and we prove a number of results
pertaining to this metric. Several open problems are raised.
1 Introduction

Many partial differential equations for the evolution of classical probability densities p{x, t) on R" can be viewed 
as describing gradient flow with respect to the 2-Wasserstein metric. This point of view is due to Felix Otto, and 
he and others have shown it to be remarkably effective for gaining quantitative control over the behavior of such 
evolution equations. We recall that for two probability densities $\rho_0$ and $\rho_1$ on $\mathbb{R}^n$, both with finite second moments,
the set of couplings $\mathcal{C}(\rho_0,\rho_1)$ is the set of all probability measures $\kappa$ on $\mathbb{R}^{2n}$ such that for all test functions $\varphi$ on
$\mathbb{R}^n$,

$$\int_{\mathbb{R}^{2n}} \varphi(x)\,d\kappa(x,y) = \int_{\mathbb{R}^n} \varphi(x)\rho_0(x)\,dx$$

and

$$\int_{\mathbb{R}^{2n}} \varphi(y)\,d\kappa(x,y) = \int_{\mathbb{R}^n} \varphi(y)\rho_1(y)\,dy .$$

That is, a probability measure $d\kappa$ on the product space $\mathbb{R}^{2n}$ is in $\mathcal{C}(\rho_0,\rho_1)$ if and only if the first and second marginals
of $d\kappa$ are $\rho_0(x)dx$ and $\rho_1(y)dy$ respectively. Then the 2-Wasserstein distance between $\rho_0$ and $\rho_1$, $W(\rho_0,\rho_1)$, is defined
by

$$W^2(\rho_0,\rho_1) = \inf_{\kappa\in\mathcal{C}(\rho_0,\rho_1)} \int_{\mathbb{R}^{2n}} \frac{1}{2}|x-y|^2\,d\kappa(x,y). \tag{1}$$

One may view the conditional distribution of $y$ under $\kappa$ given $x$, which is $\rho_0(x)^{-1}\kappa(x,y)dy$ if $\kappa$ has a density
$\kappa(x,y)$, as a "transportation plan" specifying to where the mass at $x$ gets transported, and in what proportions,
in a transportation process transforming the mass distribution $\rho_0(x)dx$ into $\rho_1(y)dy$. The function $|x-y|^2/2$ is
interpreted as giving the cost of moving a unit of mass from x to y, and then the minimum total cost, considering 
all possible "transportation plans", is the square of the Wasserstein distance. For details and background, see Q. 
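
As a small numerical illustration of definition (1), the following sketch computes the squared distance between two empirical measures on the real line. It assumes, beyond what is stated above, that both measures consist of equally many equally weighted atoms; in that one-dimensional situation the monotone (sorted) pairing is an optimal coupling. The function name `w2_squared_1d` is hypothetical, and the factor 1/2 follows the cost $|x-y|^2/2$ used in (1).

```python
import numpy as np

def w2_squared_1d(x, y):
    """Squared 2-Wasserstein distance, with cost |x - y|^2 / 2 as in (1),
    between two empirical measures on R with equally many uniform atoms.
    Assumption: in one dimension the sorted pairing is an optimal coupling."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("both samples must have the same number of atoms")
    return 0.5 * np.mean((x - y) ** 2)

rng = np.random.default_rng(0)
samples = rng.normal(size=1000)
shift = 3.0
# Translating every atom by `shift` costs shift^2 / 2 per unit of mass.
value = w2_squared_1d(samples, samples + shift)
```

For a pure translation the transport cost is the same for every atom, so `value` should agree with `shift**2 / 2` up to rounding.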
In quantum mechanics, classical probability densities are replaced by quantum mechanical density matrices; 
i.e., positive trace class operators $\rho$ on some Hilbert space such that $\mathrm{Tr}(\rho) = 1$. These are the analogs of probability
densities within the context of non-commutative probability theory originally due to Irving Segal p9, pO^, pl| . The 
starting point of his generalization of classical probability theory is the fact that the set of all complex bounded 
functions that are measurable with respect to some $\sigma$-algebra, equipped with the complex conjugation as the
involution *, form a commutative von Neumann algebra, and any probability measure on this measurable space 
induces a positive linear functional; i.e., a state on the algebra. In Segal's generalization, one drops the requirement 
that the von Neumann algebra be commutative. The resulting non-commutative probability spaces - von Neumann
algebras with a specified state - turn out to have many uses, particularly in quantum mechanics, where the L^ 
spaces built on them give a convenient representation of the operators relevant to the analysis of many physical 
systems. We shall discuss one example of this in detail below. 

If the von Neumann algebra in question is $\mathcal{B}(\mathcal{H})$, the set of all bounded operators on the Hilbert space $\mathcal{H}$, there
is no obvious non-commutative analog of the 2-Wasserstein metric. One can generalize the notion of a coupling
of two density matrices $\rho_0, \rho_1$ on a Hilbert space $\mathcal{H}$ to be a density matrix $\kappa$ on $\mathcal{H}\otimes\mathcal{H}$ whose partial traces over
the second and first factor are $\rho_0$ and $\rho_1$ respectively. Based on this idea, an analog of the Wasserstein metric has
been defined by Biane and Voiculescu in the setting of free probability El. However, in general there is no natural 
analog of the conditioning operation so that in the general quantum case, there is no natural way to decompose a 
coupling, via conditioning, into a transportation plan. Moreover, since there is no underlying metric space, there 
is no obvious analog of the cost function $|x-y|^2/2$.

However, there are physically interesting evolution equations for density matrices that are close quantum me- 
chanical relatives of classical equations for which the Wasserstein metric point of view has proven effective. This 
fact suggests that at least in certain particular non-commutative probability spaces of relevance to quantum me- 
chanics, there should be a meaningful analog of the 2- Wasserstein metric. As we shall demonstrate here, this is 
indeed the case. 

The prime example of such an evolution equation is the Fermionic Fokker-Planck Equation introduced by 
Gross |16l O]. As we explain below, this equation describes the evolution of density matrices belonging to the 
operator algebra generated by n Fermionic degrees of freedom which turns out to be a Clifford algebra. In this 
operator algebra, there is also a differential calculus, and Gross showed that using the operators pertaining to this 
differential calculus, one can write the Fermionic Fokker-Planck Equation in a form that displays it as an almost 
"identical twin" of the classical Fokker-Planck equation. 

As an example of the close parallel between the classical and Fermionic Fokker-Planck equations, consider one of
the most significant properties of the evolution described by the classical equation: its hypercontractive property,
expressed in Nelson's sharp hypercontractivity inequality [24]. The exact analog of Nelson's sharp
hypercontractivity inequality for the classical Fokker-Planck evolution has been shown to hold for the Fermionic Fokker-Planck
evolution W , where it involves non-commutative analogs of the L^ norms in the (non-commutative) operator algebra 
generated by $n$ Fermionic degrees of freedom.

Other significant features of the classical Fokker-Planck evolution have lacked a quantum counterpart. For 
instance, as shown by Jordan, Kinderlehrer and Otto [18], the classical Fokker-Planck Equation for $\rho(x,t)$ is
gradient flow in the 2-Wasserstein metric of the relative entropy of $\rho(x,t)$ with respect to the equilibrium
Gaussian measure. Moreover, crucial properties of this evolution, such as its hypercontractive properties, can be 
deduced from the convexity properties of the relative entropy functional in the 2- Wasserstein metric. A similar 
gradient flow structure in the space of probability measures has meanwhile been developed and exploited in many 
different settings § |, % |, 0, |l|, |l|, |l|, |^, |2|, H, H, 1^. 

The purpose of our paper is to construct a non-commutative analog of the 2- Wasserstein metric, and to prove 
a number of results concerning this metric that further the parallel between the quantum and classical cases. The 
first step will be to construct the metric, and here, a judicious choice of the point of departure is crucial. Among 
the many equivalent ways to define the Wasserstein metric, the one that seems most useful in the non-commutative 
setting is the dynamical approach of Benamou and Brenier 0. In their approach, couplings are defined not in 
terms of joint probability measures, but in terms of smooth paths $t \mapsto \rho(x,t)$ in the space of probability densities.
Any such path satisfies the continuity equation

$$\frac{\partial}{\partial t}\rho(x,t) + \operatorname{div}[v(x,t)\rho(x,t)] = 0 \tag{2}$$

for some time dependent vector field $v(x,t)$. A pair $\{\rho(\cdot,\cdot), v(\cdot,\cdot)\}$ is said to couple $\rho_0$ and $\rho_1$ provided that the pair
satisfies (2), $\rho(x,0) = \rho_0(x)$ and $\rho(x,1) = \rho_1(x)$. Using the same symbol $\mathcal{C}(\rho_0,\rho_1)$ to denote the set of couplings
between $\rho_0$ and $\rho_1$ in this new sense, Benamou and Brenier show that $W(\rho_0,\rho_1)$ is given by

$$W^2(\rho_0,\rho_1) = \inf_{\{\rho,v\}\in\mathcal{C}(\rho_0,\rho_1)} \frac{1}{2}\int_0^1\int_{\mathbb{R}^n} |v(x,t)|^2\rho(x,t)\,dx\,dt. \tag{3}$$

Moreover, they showed how one can characterize the geodesic paths for the 2-Wasserstein metric in terms of solutions 
of a Hamilton- Jacobi equation, and how this characterization of the geodesic paths provides an effective means 
of investigating the convexity properties of functionals on the space of probability densities with respect to the 
2-Wasserstein metric. 

We may now roughly describe our main results: Working in an operator algebra setting in which there exists 
a differential calculus, and hence a divergence, we develop a non-commutative analog of the continuity equation 
(2) and show how this leads to a non-commutative analog of the Benamou-Brenier formula for the 2-Wasserstein
distance. Actually, since there are many ways one might try to generalize (2) to the non-commutative setting,
we start out by computing a formula for the dissipation of the relative entropy along the Fokker-Planck evolution,
and use this to guide us to a suitable generalization of (2).

With a suitable continuity equation in hand, we proceed to the definition of our Riemannian metric, and prove 
that the Fermionic Fokker-Planck evolution is gradient flow for the relative entropy with respect to the ground 
state in this metric. The rest of the paper is then devoted to an investigation of the properties of this new metric. 
We note that the operator algebra we consider is finite dimensional, and so the metric we investigate is a bona- 
fide Riemannian metric. Among our other results, using the known sharp logarithmic Sobolev inequality for the 
Fermionic Fokker-Planck equation [7], we deduce a sharp Talagrand-type inequality for our metric.

We begin by recalling some useful background material on the classical and Fermionic Fokker-Planck equations. 

2 The classical and Fermionic Fokker-Planck equations 
2.1 The classical Fokker-Planck equation 



The classical Fokker-Planck equation is

$$\frac{\partial}{\partial t}f(t,x) = \nabla\cdot(\nabla + x)f(t,x), \tag{4}$$

where $f(x,t)$ is a time dependent probability density on $\mathbb{R}^n$. Note that the standard Gaussian probability density

$$\gamma_n(x) := (2\pi)^{-n/2}e^{-|x|^2/2} \tag{5}$$

is a steady-state solution.

Let $f(x,t)$ be a solution of (4), and define a function $\rho(x,t)$ by

$$f(x,t) = \rho(x,t)\gamma_n(x) . \tag{6}$$

Then $\rho(x,t)$ satisfies

$$\frac{\partial}{\partial t}\rho(t,x) = (\nabla - x)\cdot\nabla\rho(x,t). \tag{7}$$

The solution of the Cauchy problem for (7) with initial data $\rho_0(x)$ is given by Mehler's formula

$$\rho(x,t) = \int_{\mathbb{R}^n} \rho_0\left(e^{-t}x + (1-e^{-2t})^{1/2}y\right)\gamma_n(y)\,dy . \tag{8}$$

(A simple computation shows that (8) does indeed define the solution of (7) with the right initial data.)
The Mehler semigroup is the semigroup on $L^2(\mathbb{R}^n, \gamma_n(x)dx)$ consisting of the operators

$$P_t\varphi(x) = \int_{\mathbb{R}^n} \varphi\left(e^{-t}x + (1-e^{-2t})^{1/2}y\right)\gamma_n(y)\,dy .$$
Each of these operators is Markovian; i.e., positivity preserving with $P_t 1 = 1$. Hence the Mehler semigroup is a
Markovian semigroup and the associated Dirichlet form is the non-negative quadratic form

$$\mathcal{E}(\varphi,\varphi) := \lim_{t\to 0}\frac{1}{t}\int_{\mathbb{R}^n} \varphi(x)[\varphi(x) - P_t\varphi(x)]\gamma_n(x)\,dx = \int_{\mathbb{R}^n} |\nabla\varphi(x)|^2\gamma_n(x)\,dx . \tag{9}$$

The positive operator

$$N := -(\nabla - x)\cdot\nabla$$

satisfies

$$\mathcal{E}(\psi,\varphi) = \langle \psi, N\varphi\rangle_{L^2(\gamma_n)}$$

for all smooth, bounded $\psi$ and $\varphi$, and then the domain of self-adjointness is given by the Friedrichs extension.
The spectrum of $N$ consists of the non-negative integers; its eigenfunctions are the Hermite polynomials. Since the
corresponding eigenvalue is the degree of the Hermite polynomial, the operator $N$ is sometimes referred to as the
number operator. By what we have said above, $N$ is the generator of the Mehler semigroup; i.e., $P_t := e^{-tN}$, $t \ge 0$.
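
The eigenvalue statement can be checked symbolically in one dimension, where $N = -\frac{d^2}{dx^2} + x\frac{d}{dx}$. The sketch below assumes that "Hermite polynomials" means the probabilists' family $He_k$ (the one orthogonal with respect to $e^{-x^2/2}$), as implemented in NumPy's `hermite_e` module; the helper names `number_operator` and `residual` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import HermiteE

# The polynomial x, expressed in the He basis (He_1(x) = x).
x = HermiteE([0, 1])

def number_operator(p):
    """Apply N = -(d/dx - x) d/dx = -d^2/dx^2 + x d/dx to a polynomial p."""
    return -p.deriv(2) + x * p.deriv(1)

def residual(k):
    """Largest coefficient of N He_k - k He_k; zero if He_k is an eigenfunction."""
    he_k = HermiteE.basis(k)
    r = number_operator(he_k) - k * he_k
    return float(np.max(np.abs(r.coef)))

residuals = [residual(k) for k in range(8)]
```

Each entry of `residuals` should vanish up to rounding, reflecting $N He_k = k\,He_k$.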
There is a close connection between the Fokker-Planck equation and entropy. Given a probability density f{x) 
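with respect to Lebesgue measure on $\mathbb{R}^n$, the relative entropy of $f$ with respect to $\gamma_n$ is the quantity $H(f|\gamma_n)$ defined
by

$$H(f|\gamma_n) = \int_{\mathbb{R}^n} \left(\frac{f}{\gamma_n}\right)\log\left(\frac{f}{\gamma_n}\right)\gamma_n(x)\,dx
= \int_{\mathbb{R}^n} f\log f(x)\,dx + \frac{1}{2}\int_{\mathbb{R}^n} |x|^2 f(x)\,dx + \frac{n}{2}\log(2\pi).$$

Notice that if $f(x) = \rho(x)\gamma_n(x)$, then

$$H(f|\gamma_n) = \int_{\mathbb{R}^n} \rho(x)\log\rho(x)\,\gamma_n(x)\,dx .$$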



As we have mentioned above, it has been shown relatively recently by Jordan, Kinderlehrer and Otto [18] that
the Fokker-Planck equation may be viewed as the gradient flow of the relative entropy with respect to the reference
measure $\gamma_n(x)dx$ when the space of probability measures on $\mathbb{R}^n$ is equipped with a Riemannian structure induced
by the 2-Wasserstein metric, and further work has shown that many properties of the classical Fokker-Planck
evolution can be deduced from the strict uniform convexity of the relative entropy function along the geodesics for
the 2-Wasserstein metric (see, e.g., [1], [34]).



To explain the close connection between the classical Fokker-Planck equation, entropy, and the 2-Wasserstein
metric, we first write the Fokker-Planck equation (4) as a continuity equation. Note that (4) can be written as

$$\frac{\partial}{\partial t}f(t,x) + \operatorname{div}[f(t,x)v(x,t)] = 0 \tag{10}$$

where

$$v(x,t) = -\nabla\log f(x,t) - x . \tag{11}$$

To see that this choice of $v(x,t)$ is consistent with (10), write the time derivative of $f(x,t)$ as the divergence of a
vector field, and then divide this vector field by $f(x,t)$ to obtain the vector field $v(x,t)$.
Given a solution $f(x,t)$ of (4), there are many vector fields $\tilde v(x,t)$ such that

$$\frac{\partial}{\partial t}f(t,x) + \operatorname{div}[f(t,x)\tilde v(x,t)] = 0, \tag{12}$$

but the choice made in (11) is special since

$$\int_{\mathbb{R}^n} |v(x,t)|^2 f(x,t)\,dx < \int_{\mathbb{R}^n} |\tilde v(x,t)|^2 f(x,t)\,dx$$

for any other vector field $\tilde v(x,t)$ satisfying (12) for our given solution $f(x,t)$. Indeed, the set $\mathcal{K}$ of vector fields
$\tilde v$ such that (12) is satisfied is a closed convex set in the obvious Hilbertian norm, and thus there is a unique
norm-minimizing element $v_0$. Considering perturbations of $v_0$ of the form $v_0 + \epsilon f^{-1}w$ where $w(x,t)$ is, for each $t$,
a smooth compactly supported divergence free vector field, one sees that $v_0$ must satisfy

$$\int_{\mathbb{R}^n} v_0(x,t)\cdot w(x,t)\,dx = 0$$

for each $t$, and thus, that $v_0(x,t)$ is, for each $t$, a gradient. One then shows that there is only one gradient vector
field in $\mathcal{K}$, and hence, since the vector field $v(x,t)$ given in (11) is a gradient, it is the minimizer. We only sketch
this argument here since we will give all of the details of the analogous argument in the non-commutative setting 
shortly. For further discussion in the classical case, see ||]. 

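Now, from the Benamou-Brenier formula for the Wasserstein distance, and the minimizing property of the
vector field $v(x,t)$ given in (11), we see that

$$W^2(f(\cdot,t), f(\cdot,t+h)) = \left(\frac{1}{2}\int_{\mathbb{R}^n} |v(x,t)|^2 f(x,t)\,dx\right)h^2 + o(h^2) .$$

Next, we compute, using the continuity equation form of the Fokker-Planck equation,

$$\frac{d}{dt}H(f|\gamma_n) = -\int_{\mathbb{R}^n} \left[\log f(x,t) + \frac{1}{2}|x|^2\right]\operatorname{div}[f(t,x)v(x,t)]\,dx
= \int_{\mathbb{R}^n} [\nabla\log f(x,t) + x]\cdot[f(t,x)v(x,t)]\,dx
= -\int_{\mathbb{R}^n} |v(x,t)|^2 f(x,t)\,dx . \tag{13}$$

In summary, for solutions $f(x,t)$ of the classical Fokker-Planck equation, one has

$$\frac{d}{dt}H(f|\gamma_n) = -2\lim_{h\to 0}\frac{W^2(f(\cdot,t), f(\cdot,t+h))}{h^2} . \tag{14}$$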



When we come to the non-commutative case, it will not be so evident how to rewrite the Fermionic Fokker-Planck 
equation in continuity equation form. The logarithmic gradient of $f(x,t)$ enters in (11) because we divided by
$f(x,t)$ in the course of deducing the formula (11) for $v(x,t)$. In the non-commutative case this division must be
done in a rather indirect way to achieve the desired result, and we shall arrive at the appropriate division formula 
by working backwards from a calculation of entropy dissipation. 

First, we introduce the Fermionic Fokker-Planck equation, beginning with a brief introduction to Clifford 
algebras as non-commutative probability spaces. 



2.2 The Clifford algebra as a non-commutative probability space 

Let $\mathcal{H}$ be a complex Hilbert space and let $Q_1, \dots, Q_n$ be bounded operators on $\mathcal{H}$ satisfying the canonical
anticommutation relations (CAR)

$$Q_iQ_j + Q_jQ_i = 2\delta_{ij}I . \tag{15}$$

The Clifford algebra $\mathfrak{C}$ is the operator algebra generated by $Q_1, \dots, Q_n$. We say "the" Clifford algebra because any
two realizations are unitarily equivalent. We give a brief introduction to $\mathfrak{C}$ here. Though fairly self-contained for
our purposes, we refer to [7] for more detail and further references.

One realization of $\mathfrak{C}$ as an operator algebra may be achieved on the Hilbert space $\mathcal{H}$ that is the $n$-fold tensor
product of $\mathbb{C}^2$ with itself. Let

$$Q = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad\text{and}\quad U = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} .$$

Then let $Q_j$ be the tensor product of the form

$$Q_j = X_1\otimes X_2\otimes\cdots\otimes X_n ,$$

where $X_j = Q$, where $X_i = U$ for all $i < j$, and where $X_k = I$, the $2\times 2$ identity matrix, for all $k > j$. Then one
readily verifies that the canonical anti-commutation relations are satisfied.
There is a natural injection of $\mathbb{R}^n$ into $\mathfrak{C}$ given by

$$J(x) := \sum_{j=1}^n x_jQ_j . \tag{16}$$

One then sees, as a consequence of (15), that $J(x)^2 = |x|^2I$, which is often taken as the relation defining $\mathfrak{C}$.
Let $\tau$ denote the normalized trace on $\mathfrak{C}$. That is, if $A$ is any operator on $\mathcal{H}$ belonging to $\mathfrak{C}$,

$$\tau(A) = 2^{-n}\,\mathrm{Tr}(A) .$$

Evidently if $A$ is positive in $\mathfrak{C}$, meaning that $A$ has positive spectrum, or what is the same, $A = B^*B$ with $B$ in $\mathfrak{C}$,
then $\tau(A) \ge 0$. Also evidently $\tau(I) = 1$ where $I$ is the identity in $\mathfrak{C}$. Thus, $\tau$ is a state on $\mathfrak{C}$. It may appear that $\tau$
depends on the particular representation of the CAR that we are employing but this is not the case:
An $n$-tuple $\alpha = (\alpha_1, \dots, \alpha_n) \in \{0,1\}^n$ is called a Fermionic multi-index. We set $|\alpha| := \sum_{j=1}^n \alpha_j$ and

$$Q^\alpha := Q_1^{\alpha_1}\cdots Q_n^{\alpha_n} .$$

One readily verifies that

$$\tau(Q^\alpha) = \delta_{0,|\alpha|} . \tag{17}$$

Since the $\{Q^\alpha\}$ are a basis for $\mathfrak{C}$, there is at most one state, namely $\tau$, that satisfies (17).
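
The tensor-product realization and the identity (17) lend themselves to a direct numerical check. The sketch below is illustrative only: it assumes the concrete $2\times 2$ matrices $Q$ (off-diagonal ones) and $U = \mathrm{diag}(1,-1)$, and the helper names `generators`, `tau` and `monomial` are hypothetical.

```python
import numpy as np
from functools import reduce
from itertools import product

# Assumed 2x2 building blocks: Q swaps the basis vectors, U = diag(1, -1).
Q = np.array([[0, 1], [1, 0]], dtype=complex)
U = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def generators(n):
    """Q_j = U x ... x U x Q x I x ... x I, with Q in tensor slot j."""
    return [reduce(np.kron, [U] * j + [Q] + [I2] * (n - j - 1)) for j in range(n)]

def tau(A):
    """Normalized trace 2^{-n} Tr(A)."""
    return np.trace(A) / A.shape[0]

def monomial(Qs, alpha):
    """Q^alpha = Q_1^{alpha_1} ... Q_n^{alpha_n} for a Fermionic multi-index."""
    out = np.eye(Qs[0].shape[0], dtype=complex)
    for Qj, a in zip(Qs, alpha):
        if a:
            out = out @ Qj
    return out

n = 3
Qs = generators(n)
identity = np.eye(2 ** n)

# Canonical anticommutation relations (15).
car_ok = all(
    np.allclose(Qs[i] @ Qs[j] + Qs[j] @ Qs[i], 2 * (i == j) * identity)
    for i in range(n) for j in range(n)
)
# Trace identity (17): tau(Q^alpha) is 1 for alpha = 0 and 0 otherwise.
trace_ok = all(
    np.isclose(tau(monomial(Qs, alpha)), 1.0 if sum(alpha) == 0 else 0.0)
    for alpha in product((0, 1), repeat=n)
)
```

Both flags should come out true for any small `n`; the matrices have size $2^n \times 2^n$, so the check is only practical for a handful of generators.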

As emphasized by Segal ||2^, ^ Q, (£, t) is an example of a non-commutative probability space that is a close 
analog of the standard Gaussian probability space $(\mathbb{R}^n, \gamma_n(x)dx)$ where

$$\gamma_n(x) := (2\pi)^{-n/2}e^{-|x|^2/2} .$$

For instance, a characteristic property of isotropic Gaussian probability measures on $\mathbb{R}^n$ is that if $V$ and $W$ are two
orthogonal subspaces of $\mathbb{R}^n$, and $f$ and $g$ are two functions on $\mathbb{R}^n$ such that $f(x)$ depends only on the component
of $x$ in $V$, and $g(x)$ depends only on the component of $x$ in $W$, then

$$\int_{\mathbb{R}^n} f(x)g(x)\gamma_n(x)\,dx = \left(\int_{\mathbb{R}^n} f(x)\gamma_n(x)\,dx\right)\left(\int_{\mathbb{R}^n} g(x)\gamma_n(x)\,dx\right) . \tag{18}$$

That is, under an isotropic Gaussian probability law on $\mathbb{R}^n$, random variables generated by orthogonal subspaces
of $\mathbb{R}^n$ are statistically independent, and as is well known, this property is characteristic of isotropic Gaussian laws.
In the case of the Clifford algebra, let $V$ and $W$ be orthogonal subspaces of $\mathbb{R}^n$, and let $\mathfrak{C}_V$ and $\mathfrak{C}_W$, respectively,
be the subalgebras of $\mathfrak{C}$ generated by $J(V)$ and $J(W)$. Then it is easy to see that if $A \in \mathfrak{C}_V$ and $B \in \mathfrak{C}_W$, then

$$\tau(AB) = \tau(A)\tau(B) ,$$

the analog of (18).



2.3 Differential calculus on the Clifford algebra 

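The Clifford algebra becomes a Hilbert space endowed with the inner product

$$\langle A, B\rangle_{L^2(\tau)} = \tau(A^*B), \qquad A, B \in \mathfrak{C} .$$

The $2^n$ operators $(Q^\alpha)_{\alpha\in\{0,1\}^n}$ form an orthonormal basis for $\mathfrak{C}$.
For $i = 1, \dots, n$, we define the partial derivative by

$$\nabla_i(Q^\alpha) = \begin{cases} 0, & \alpha_i = 0 , \\ Q_iQ^\alpha, & \alpha_i = 1 , \end{cases}$$

and linear extension. We will also consider the gradient

$$\nabla : \mathfrak{C} \to \mathfrak{C}^n, \qquad A \mapsto (\nabla_1(A), \dots, \nabla_n(A)) .$$

It is easy to check that

$$\nabla_i(A) = \frac{1}{2}\left(Q_iA - \Gamma(A)Q_i\right) ,$$

where $\Gamma$ denotes the grading operator defined by

$$\Gamma(Q^\alpha) := (-1)^{|\alpha|}Q^\alpha .$$

For $A, B \in \mathfrak{C}$ the product rule

$$\nabla_i(AB) = \Gamma(A)\nabla_i(B) + \nabla_i(A)B \tag{19}$$

holds, and the following identities are readily checked:

$$\Gamma(AB) = \Gamma(A)\Gamma(B) , \tag{20}$$

$$\Gamma(A^*) = \Gamma(A)^* , \tag{21}$$

$$\tau(\Gamma(A)B) = \tau(A\Gamma(B)) , \tag{22}$$

$$(\nabla(A^*))^* = \Gamma(\nabla A) = -\nabla(\Gamma(A)) . \tag{23}$$

By (20) and (21), $A \mapsto \Gamma(A)$ is a $*$-automorphism, and it is often called the principal automorphism in $\mathfrak{C}$.

Here, and throughout the rest of this work, we use the convention that for $\mathbf{A} = (A_1, \dots, A_n) \in \mathfrak{C}^n$ and $B \in \mathfrak{C}$,

$$\mathbf{A}B := (A_1B, \dots, A_nB) , \qquad B\mathbf{A} := (BA_1, \dots, BA_n) .$$

Similarly, we will also extend an operator $T$ acting on $\mathfrak{C}$ to an operator on $\mathfrak{C}^n$ in the obvious way, by defining

$$T\mathbf{A} := (T(A_1), \dots, T(A_n)) .$$

The adjoint of $\nabla_i$ with respect to the $L^2(\tau)$-inner product is given by

$$\nabla_i^*(A) = \frac{1}{2}\left(Q_iA + \Gamma(A)Q_i\right), \qquad A \in \mathfrak{C} .$$

It follows that

$$\nabla_i^*(Q^\alpha) = \begin{cases} 0, & \alpha_i = 1 , \\ Q_iQ^\alpha, & \alpha_i = 0 , \end{cases}$$

and the identities

$$(\nabla_i^*(A^*))^* = -\Gamma(\nabla_i^*A) = \nabla_i^*(\Gamma(A)) \tag{24}$$

hold. As usual, the divergence operator is defined by

$$\operatorname{div}(\mathbf{A}) := -\sum_{i=1}^n \nabla_i^*(A_i) .$$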
2.4 The Fermionic Fokker-Planck equation

As noted above, an element $A$ of $\mathfrak{C}$ is non-negative if for some $B \in \mathfrak{C}$, $A = B^*B$. An element $A$ of $\mathfrak{C}$ is strictly
positive if for some $B \in \mathfrak{C}$ and some $\lambda > 0$, $A = B^*B + \lambda I$. Let $\mathfrak{P}$ denote the set of (non-commutative) probability
densities, i.e., all non-negative elements $\rho \in \mathfrak{C}$ satisfying $\tau(\rho) = 1$. Let $\mathfrak{P}_+$ denote the set of strictly positive
probability densities. The Fermionic Fokker-Planck Equation is an evolution equation for probability densities in
$\mathfrak{C}$ that we now define, starting from an analog of the Dirichlet form (9) associated to the classical Fokker-Planck
equation.

Gross's Fermionic Dirichlet form $\mathcal{E}(A,A)$ on $\mathfrak{C}$ is defined by

$$\mathcal{E}(A,A) = \tau\left((\nabla A)^*\cdot\nabla A\right) = \sum_{i=1}^n \tau\left((\nabla_iA)^*\,\nabla_iA\right) .$$

In so far as $\tau$ is an analog of integration against $\gamma_n(x)dx$, this is a direct analog of (9).
The Fermionic number operator $\mathcal{N}$ is defined by

$$\mathcal{E}(B,A) = \langle B, \mathcal{N}A\rangle_{L^2(\tau)} ,$$

and the Fermionic Mehler semigroup is given by

$$\mathcal{P}_t := e^{-t\mathcal{N}}$$

for $t \ge 0$. On the basis of the connection between the Mehler semigroup and the classical Fokker-Planck equation,
we refer to

$$\frac{d}{dt}\rho(t) = -\mathcal{N}\rho(t) \tag{25}$$

as the Fermionic Fokker-Planck equation. More precisely, this is a direct analog of (7), the classical Fokker-Planck
equation for the evolution of a density with respect to the Gaussian reference measure $\gamma_n(x)dx$, instead of with
respect to Lebesgue measure, as there is no analog of Lebesgue measure in the quantum non-commutative setting.

At this point it is not obvious that $\mathcal{P}_t\rho \in \mathfrak{P}$ whenever $\rho \in \mathfrak{P}$. Since $\mathcal{N}I = 0$, it is easy to see that $\tau(\mathcal{P}_t\rho) = \tau(\rho)$
for all t, but the positivity is less evident. One way to see this is through an analog of Mehler's formula that is 
valid for the Fermionic Mehler semigroup; see ||^. 

3 The continuity equation in the Clifford algebra and the Riemannian
metric

We are finally finished with preliminaries and ready to begin our investigation. If we are to show that the Fermionic 
Fokker-Planck evolution is gradient flow for the relative entropy, it must at least be the case that relative entropy 
is dissipated along this evolution. We start by deducing a formula for the rate of dissipation, and proceed from 
there to a study of the continuity equation in £. 

3.1 Entropy dissipation along the Fermionic Fokker-Planck evolution 

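For $\rho \in \mathfrak{P}$, we define the relative entropy of $\rho$ with respect to $\tau$ to be

$$S(\rho) = \tau[\rho\log\rho] .$$

Given $\rho_0 \in \mathfrak{P}$, define $\rho_t := \mathcal{P}_t\rho_0$. Then

$$\frac{d}{dt}S(\rho_t) = -\tau\left[\log\rho_t\,\mathcal{N}\rho_t\right] = -\tau\left[(\nabla\log\rho_t)^*\cdot\nabla\rho_t\right] . \tag{26}$$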
Our first goal is to rewrite this as the negative of a complete square analogous to (13), with the hope of
identifying, through this computation, the form of the "minimal" vector field in a continuity equation representation
of the Fermionic Fokker-Planck equation. We use the following lemma:

Lemma 3.1. For any $\rho \in \mathfrak{P}_+$, and any index $i$,

$$\nabla_i\rho = \int_0^1 \Gamma(\rho)^{1-s}\,[\nabla_i\log\rho]\,\rho^s\,ds . \tag{27}$$

Proof. Since $\rho \in \mathfrak{P}_+$,

$$\rho = \lim_{k\to\infty}\left(I + \frac{1}{k}\log\rho\right)^k ,$$

and by the product rule (19),

$$\nabla_i\left(I + \frac{1}{k}\log\rho\right)^k = \frac{1}{k}\sum_{l=0}^{k-1}\Gamma\left(I + \frac{1}{k}\log\rho\right)^{l}[\nabla_i\log\rho]\left(I + \frac{1}{k}\log\rho\right)^{k-l-1} .$$

The result follows upon taking limits. □
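
Lemma 3.1 can be checked numerically for a randomly generated strictly positive density. The sketch below makes several assumptions that are not part of the text: the tensor-product realization with $U = \mathrm{diag}(1,-1)$, the grading implemented as conjugation by $U^{\otimes n}$, and the evaluation of the $s$-integral in eigenbases, where $\int_0^1 a^{1-s}b^s\,ds$ is the logarithmic mean $(a-b)/(\log a - \log b)$. All function names are hypothetical.

```python
import numpy as np
from functools import reduce
from itertools import product

Q = np.array([[0, 1], [1, 0]], dtype=complex)
U = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

n = 2
dim = 2 ** n
Qs = [reduce(np.kron, [U] * j + [Q] + [I2] * (n - j - 1)) for j in range(n)]
W = reduce(np.kron, [U] * n)  # conjugation by W realizes the grading

def Gamma(A):
    return W @ A @ W

def grad(A, i):
    return 0.5 * (Qs[i] @ A - Gamma(A) @ Qs[i])

def logmean(a, b):
    """Matrix of logarithmic means of a_i and b_j, written in the stable
    form sqrt(a b) * sinh(h) / h with h = (log a - log b) / 2."""
    a, b = a[:, None], b[None, :]
    half = 0.5 * (np.log(a) - np.log(b))
    small = np.abs(half) < 1e-4
    ratio = np.where(small, 1.0 + half ** 2 / 6.0,
                     np.sinh(half) / np.where(small, 1.0, half))
    return np.sqrt(a * b) * ratio

def mult(A, B, C):
    """Integral over s in [0, 1] of A^{1-s} C B^s, for positive definite A, B."""
    a, Ua = np.linalg.eigh(A)
    b, Ub = np.linalg.eigh(B)
    return Ua @ ((Ua.conj().T @ C @ Ub) * logmean(a, b)) @ Ub.conj().T

def random_density(rng):
    """rho = exp(H) / tau(exp(H)) for a random self-adjoint H in the algebra."""
    H = np.zeros((dim, dim), dtype=complex)
    for alpha in product((0, 1), repeat=n):
        mono = np.eye(dim, dtype=complex)
        for Qj, a in zip(Qs, alpha):
            if a:
                mono = mono @ Qj
        H += 0.3 * rng.normal() * mono
    H = 0.5 * (H + H.conj().T)
    w, V = np.linalg.eigh(H)
    expH = (V * np.exp(w)) @ V.conj().T
    Z = np.trace(expH).real / dim
    return expH / Z, H - np.log(Z) * np.eye(dim)

rng = np.random.default_rng(1)
rho, log_rho = random_density(rng)

# Compare both sides of (27) for every index i.
error = max(
    np.max(np.abs(grad(rho, i) - mult(Gamma(rho), rho, grad(log_rho, i))))
    for i in range(n)
)
```

The quantity `error` should be at the level of rounding. Building $\rho$ as a normalized exponential keeps it inside the algebra and gives $\log\rho$ in closed form, which avoids a separate matrix logarithm.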



Remark 3.2. It is possible to develop a systematic chain rule for $\nabla$, but this simple example is all we need at
present.

Combining (26) and (27), we obtain

$$\frac{d}{dt}S(\rho_t) = -\tau\left[(\nabla\log\rho_t)^*\cdot\int_0^1 \Gamma(\rho_t)^{1-s}\,[\nabla\log\rho_t]\,\rho_t^s\,ds\right] . \tag{28}$$

The formula (27) is the analog of the classical formula $\nabla f(x) = f(x)\nabla\log f(x)$. It suggests that the meaningful
analog of dividing by $\rho$ in $\mathfrak{C}$ will involve inversion of the operation

$$C \mapsto \int_0^1 \Gamma(\rho)^{1-s}\,C\,\rho^s\,ds$$

in $\mathfrak{C}$. This brings us to the following definition:

Definition 3.3. Given strictly positive $m\times m$ matrices $A$ and $B$, define the linear transformation $(A,B)\#$ from
the space of $m\times m$ matrices into itself by

$$(A,B)\#C = \int_0^1 A^{1-s}\,C\,B^s\,ds . \tag{29}$$

The next theorem is not original, but as we lack a ready reference, we provide the short proof. We note that
the $A = B$ case is used in [19].

Theorem 3.4. Let $A$ and $B$ be strictly positive definite $m\times m$ matrices. Then the linear transformation $(A,B)\#$
from the space of $m\times m$ matrices into itself is invertible, and if $(A,B)\#C = D$, then

$$C = \int_0^\infty (A+xI)^{-1}\,D\,(B+xI)^{-1}\,dx . \tag{30}$$

Proof. Let $\{u_1, \dots, u_m\}$ be an orthonormal basis of $\mathbb{C}^m$ consisting of eigenvectors of $A$, and let $\{v_1, \dots, v_m\}$ be an
orthonormal basis of $\mathbb{C}^m$ consisting of eigenvectors of $B$. Let $Au_i = a_iu_i$ and $Bv_j = b_jv_j$ for each $i$ and $j$. Then

$$\left\langle u_i, \left(\int_0^\infty (A+xI)^{-1}\left[\int_0^1 A^{1-s}CB^s\,ds\right](B+xI)^{-1}\,dx\right)v_j\right\rangle
= \langle u_i, Cv_j\rangle \int_0^1\left(\int_0^\infty \frac{1}{a_i+x}\,\frac{1}{b_j+x}\,dx\right)a_i^{1-s}b_j^s\,ds .$$

Thus it suffices to show that

$$\int_0^1\left(\int_0^\infty \frac{1}{a+x}\,\frac{1}{b+x}\,dx\right)a^{1-s}b^s\,ds = 1$$

for all strictly positive numbers $a$ and $b$. If $a = b$, this is immediately clear. Otherwise, one computes

$$\int_0^\infty \frac{1}{a+x}\,\frac{1}{b+x}\,dx = \frac{1}{a-b}\log(a/b) ,$$

from which the desired result follows directly. □

The inverse operation will be used frequently in what follows since it provides our "division by $\rho$" operation,
and so we make a definition:

Definition 3.5. Given strictly positive $m\times m$ matrices $A$ and $B$, define the linear transformation $(A,B)\widehat{\#}$ from
the space of $m\times m$ matrices into itself by

$$(A,B)\widehat{\#}C = \int_0^\infty (A+xI)^{-1}\,C\,(B+xI)^{-1}\,dx . \tag{31}$$
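
The inversion statement of Theorem 3.4 can be illustrated numerically. In the sketch below the "multiplication" operation is computed by Gauss-Legendre quadrature of the $s$-integral, while the "division" operation is computed in eigenbases using the scalar identity from the proof, $\int_0^\infty \frac{dx}{(a+x)(b+x)} = \frac{\log(a/b)}{a-b}$. The two routines are therefore independent of each other. The function names and the choice of 40 quadrature nodes are assumptions made for illustration.

```python
import numpy as np

def power(A, p):
    """A^p for a Hermitian positive definite matrix A."""
    w, V = np.linalg.eigh(A)
    return (V * w ** p) @ V.conj().T

def mult_quadrature(A, B, C, nodes=40):
    """Approximate the integral over s in [0, 1] of A^{1-s} C B^s."""
    x, wts = np.polynomial.legendre.leggauss(nodes)
    s, wts = 0.5 * (x + 1.0), 0.5 * wts  # map nodes from [-1, 1] to [0, 1]
    return sum(wk * power(A, 1.0 - sk) @ C @ power(B, sk) for sk, wk in zip(s, wts))

def logmean(a, b):
    """Logarithmic means of a_i and b_j in the stable form sqrt(ab) sinh(h)/h."""
    a, b = a[:, None], b[None, :]
    half = 0.5 * (np.log(a) - np.log(b))
    small = np.abs(half) < 1e-4
    ratio = np.where(small, 1.0 + half ** 2 / 6.0,
                     np.sinh(half) / np.where(small, 1.0, half))
    return np.sqrt(a * b) * ratio

def divide(A, B, D):
    """Integral over x in [0, inf) of (A + x)^{-1} D (B + x)^{-1}, in eigenbases."""
    a, Ua = np.linalg.eigh(A)
    b, Ub = np.linalg.eigh(B)
    return Ua @ ((Ua.conj().T @ D @ Ub) / logmean(a, b)) @ Ub.conj().T

def random_positive(rng, m):
    X = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
    return X @ X.conj().T + np.eye(m)

rng = np.random.default_rng(2)
m = 4
A, B = random_positive(rng, m), random_positive(rng, m)
C = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))

D = mult_quadrature(A, B, C)
recovered = divide(A, B, D)
```

If the theorem holds as stated, `recovered` agrees with `C` up to quadrature and rounding error.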

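The following inequalities will be useful:

Lemma 3.6. Let $A$ and $B$ be $m\times m$ matrices satisfying $A, B \ge \epsilon I$ for some $\epsilon > 0$. Then, for all $m\times m$ matrices
$C$,

$$\epsilon\,\mathrm{Tr}\left[C^*(A,B)\widehat{\#}C\right] \le \mathrm{Tr}\left[C^*C\right] \le \frac{1}{\epsilon}\,\mathrm{Tr}\left[C^*(A,B)\#C\right] .$$

Proof. Consider the spectral decompositions $A = \sum_i a_i|u_i\rangle\langle u_i|$ and $B = \sum_j b_j|v_j\rangle\langle v_j|$, where $|u\rangle\langle u|$ denotes the spectral projection
corresponding to the eigenvector $u$. Then we can write $C = \sum_{i,j} c_{ij}|u_i\rangle\langle v_j|$ for some uniquely determined $c_{ij} \in \mathbb{C}$.
Since $a_i, b_j \ge \epsilon$ by assumption, it follows that

$$\mathrm{Tr}\left[C^*(A,B)\widehat{\#}C\right] = \sum_{i,j}|c_{ij}|^2\int_0^\infty \frac{1}{a_i+x}\,\frac{1}{b_j+x}\,dx
\le \sum_{i,j}|c_{ij}|^2\int_0^\infty \frac{dx}{(\epsilon+x)^2}
= \frac{1}{\epsilon}\,\mathrm{Tr}\left[C^*C\right] ,$$

which proves the first inequality. The second inequality follows from the same argument, using $\int_0^1 a_i^{1-s}b_j^s\,ds \ge \epsilon$. □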

We also observe:

Lemma 3.7. Given strictly positive $m\times m$ matrices $A$ and $B$, for all $m\times m$ matrices $C$,

$$\mathrm{Tr}\left[C^*(A,B)\#C\right] \ge 0 ,$$

and there is equality if and only if $C = 0$. Moreover, for all $m\times m$ matrices $C$ and $D$,

$$\mathrm{Tr}\left[C^*(A,B)\#D\right] = \left(\mathrm{Tr}\left[D^*(A,B)\#C\right]\right)^* .$$

Proof. It follows from the second inequality in Lemma 3.6 that the quantity $\mathrm{Tr}[C^*(A,B)\#C]$ is non-negative and
vanishes if and only if $C = 0$.
Next, using the fact that for all $m\times m$ matrices $C$, $\mathrm{Tr}(C) = [\mathrm{Tr}(C^*)]^*$, and then cyclicity of the trace,

$$\mathrm{Tr}\left[C^*(A,B)\#D\right] = \int_0^1 \mathrm{Tr}\left[C^*A^{1-s}DB^s\right]ds
= \left(\int_0^1 \mathrm{Tr}\left[B^sD^*A^{1-s}C\right]ds\right)^*
= \left(\int_0^1 \mathrm{Tr}\left[D^*A^{1-s}CB^s\right]ds\right)^*
= \left(\mathrm{Tr}\left[D^*(A,B)\#C\right]\right)^* .$$
□

Using our new notation, we may rewrite (28) as

$$\frac{d}{dt}S(\rho_t) = -\tau\left[(\nabla\log\rho_t)^*\cdot(\Gamma(\rho_t),\rho_t)\#\nabla\log\rho_t\right] . \tag{32}$$

Note that by Lemma 3.7, the right hand side is strictly negative unless $\rho_t = I$. We have now achieved a meaningful
analog of (13) that will lead us to a meaningful definition of the continuity equation in $\mathfrak{C}$. Before coming to this,
we continue by proving several formulas pertaining to the inner product implicit in (32) that will be useful later
when we define our Riemannian metric on $\mathfrak{P}$.
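
The agreement between the two expressions for the entropy dissipation, $-\tau[\log\rho\,\mathcal{N}\rho]$ from (26) and the quadratic form in (32), can be tested numerically at a fixed density. The sketch assumes the tensor-product realization with $U = \mathrm{diag}(1,-1)$, the number operator written as $\mathcal{N} = \sum_i \nabla_i^*\nabla_i$, and the log-mean evaluation of the $s$-integral; every name in the code is hypothetical.

```python
import numpy as np
from functools import reduce
from itertools import product

Q = np.array([[0, 1], [1, 0]], dtype=complex)
U = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

n = 2
dim = 2 ** n
Qs = [reduce(np.kron, [U] * j + [Q] + [I2] * (n - j - 1)) for j in range(n)]
W = reduce(np.kron, [U] * n)

def tau(A):
    return np.trace(A) / dim

def Gamma(A):
    return W @ A @ W

def grad(A, i):
    return 0.5 * (Qs[i] @ A - Gamma(A) @ Qs[i])

def grad_adj(A, i):
    return 0.5 * (Qs[i] @ A + Gamma(A) @ Qs[i])

def number_op(A):
    """Assumed form of the number operator: sum_i grad_i^* grad_i."""
    return sum(grad_adj(grad(A, i), i) for i in range(n))

def logmean(a, b):
    a, b = a[:, None], b[None, :]
    half = 0.5 * (np.log(a) - np.log(b))
    small = np.abs(half) < 1e-4
    ratio = np.where(small, 1.0 + half ** 2 / 6.0,
                     np.sinh(half) / np.where(small, 1.0, half))
    return np.sqrt(a * b) * ratio

def mult(A, B, C):
    """Integral over s in [0, 1] of A^{1-s} C B^s, evaluated in eigenbases."""
    a, Ua = np.linalg.eigh(A)
    b, Ub = np.linalg.eigh(B)
    return Ua @ ((Ua.conj().T @ C @ Ub) * logmean(a, b)) @ Ub.conj().T

# Random strictly positive density rho = exp(H) / tau(exp(H)) in the algebra.
rng = np.random.default_rng(3)
H = np.zeros((dim, dim), dtype=complex)
for alpha in product((0, 1), repeat=n):
    mono = np.eye(dim, dtype=complex)
    for Qj, a in zip(Qs, alpha):
        if a:
            mono = mono @ Qj
    H += 0.3 * rng.normal() * mono
H = 0.5 * (H + H.conj().T)
w, V = np.linalg.eigh(H)
expH = (V * np.exp(w)) @ V.conj().T
Z = np.trace(expH).real / dim
rho, log_rho = expH / Z, H - np.log(Z) * np.eye(dim)

# Left side: -tau[log(rho) N rho]; right side: the quadratic form in (32).
lhs = -tau(log_rho @ number_op(rho)).real
rhs = -sum(
    tau(grad(log_rho, i).conj().T @ mult(Gamma(rho), rho, grad(log_rho, i)))
    for i in range(n)
).real
```

The two numbers `lhs` and `rhs` should coincide, and both should be negative since the sampled density differs from the identity.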

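Definition 3.8. Let $\rho \in \mathfrak{P}_+$. For any $A, B \in \mathfrak{C}$, define the sesquilinear form

$$\langle A, B\rangle_\rho := \tau\left[A^*(\Gamma(\rho),\rho)\#B\right] ,$$

which by Lemma 3.7 is an inner product on $\mathfrak{C}$. We define

$$\|A\|_\rho = \sqrt{\langle A, A\rangle_\rho}$$

to be the corresponding norm. Similarly, for $\mathbf{A}, \mathbf{B} \in \mathfrak{C}^n$ we define the inner product

$$\langle \mathbf{A}, \mathbf{B}\rangle_\rho := \sum_{i=1}^n \langle A_i, B_i\rangle_\rho$$

and the corresponding norm

$$\|\mathbf{A}\|_\rho = \left(\sum_{i=1}^n \|A_i\|_\rho^2\right)^{1/2} .$$

Lemma 3.9 (Properties of the inner product $\langle\cdot,\cdot\rangle_\rho$). For any $A, B \in \mathfrak{C}$,

$$\langle A, B\rangle_\rho = \langle \Gamma(B^*), \Gamma(A^*)\rangle_\rho .$$

Moreover, if $U, V \in \mathfrak{C}$ are self-adjoint, then $\langle \nabla U, \nabla V\rangle_\rho \in \mathbb{R}$.

Proof. Using the substitution $s \mapsto 1-s$, cyclicity of the trace and (20)-(22),

$$\langle A, B\rangle_\rho = \int_0^1 \tau\left[A^*\Gamma(\rho)^{s}B\rho^{1-s}\right]ds
= \int_0^1 \tau\left[B\rho^{1-s}A^*\Gamma(\rho)^{s}\right]ds
= \int_0^1 \tau\left[\Gamma(B^*)^*\,\Gamma(\rho)^{1-s}\,\Gamma(A^*)\,\rho^{s}\right]ds
= \langle \Gamma(B^*), \Gamma(A^*)\rangle_\rho . \tag{33}$$

Next, if $U = U^*$ and $V = V^*$ we obtain using (23), (22), and cyclicity of the trace,

$$\langle \nabla U, \nabla V\rangle_\rho = \int_0^1 \tau\left((\nabla U)^*\cdot\Gamma(\rho)^{1-s}\cdot\nabla V\cdot\rho^{s}\right)ds
= \int_0^1 \tau\left(\Gamma(\nabla U)\cdot\Gamma(\rho)^{1-s}\cdot\Gamma(\nabla V)^*\cdot\rho^{s}\right)ds$$
$$= \int_0^1 \tau\left(\nabla U\cdot\rho^{1-s}\cdot(\nabla V)^*\cdot\Gamma(\rho)^{s}\right)ds
= \int_0^1 \tau\left((\nabla V)^*\cdot\Gamma(\rho)^{s}\cdot\nabla U\cdot\rho^{1-s}\right)ds
= \langle \nabla V, \nabla U\rangle_\rho .$$

Since $\langle \nabla U, \nabla V\rangle_\rho = \langle \nabla V, \nabla U\rangle_\rho^*$ by Lemma 3.7, the claim follows. □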



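3.2 The continuity equation in the Clifford algebra

Let $\rho(t)$ denote a continuously differentiable curve in $\mathfrak{P}_+$. Let us use the notation

$$\dot\rho(t) := \frac{d}{dt}\rho(t) .$$

Then evidently,

$$0 = \tau[\dot\rho(t)] = \langle I, \dot\rho(t)\rangle_{L^2(\tau)} ,$$

so that $\dot\rho(t)$ is orthogonal to the null space of $\mathcal{N}$. Hence

$$\dot\rho(t) = \mathcal{N}\left(\mathcal{N}^{-1}\dot\rho(t)\right) .$$

Thus, defining

$$\mathbf{A}(t) := \nabla\left(\mathcal{N}^{-1}\dot\rho(t)\right) ,$$

we have

$$\dot\rho(t) + \operatorname{div}(\mathbf{A}(t)) = 0 .$$

To write this in the form of a continuity equation, we use the versions of "division by $\rho$" and "multiplication by $\rho$"
defined in the previous section to define

$$\mathbf{V}(t) := (\Gamma(\rho(t)),\rho(t))\widehat{\#}\mathbf{A}(t) .$$

Then by Theorem 3.4, we have that

$$\dot\rho(t) + \operatorname{div}\left((\Gamma(\rho(t)),\rho(t))\#\mathbf{V}(t)\right) = 0 . \tag{34}$$

Definition 3.10 (The continuity equation in the Clifford algebra). Given a vector field $\mathbf{V}(t)$ on $\mathfrak{C}$ depending
continuously on $t \in \mathbb{R}$, a continuously differentiable curve $\rho(t)$ in $\mathfrak{P}_+$ satisfies the continuity equation for $\mathbf{V}(t)$ in
case (34) is satisfied.

If $\rho(t)$ is a continuously differentiable curve in $\mathfrak{P}$, then $\dot\rho(t)$ is self-adjoint for each $t$. Considering the definition
of the continuity equation in the Clifford algebra that we have given, this raises the following question: For which
$\mathbf{V} \in \mathfrak{C}^n$ is $\operatorname{div}\left((\Gamma(\rho(t)),\rho(t))\#\mathbf{V}\right)$ self-adjoint? The following theorem provides an answer that serves our purposes
here:

Theorem 3.11. For $C \in \mathfrak{C}$ and $\rho \in \mathfrak{P}_+$ one has

$$\operatorname{div}\left((\Gamma(\rho),\rho)\#\nabla(C^*)\right) = \left[\operatorname{div}\left((\Gamma(\rho),\rho)\#\nabla C\right)\right]^* .$$

Consequently, if $C$ is self-adjoint, then

$$\operatorname{div}\left((\Gamma(\rho),\rho)\#\nabla C\right)$$

is self-adjoint as well.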
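We preface the proof with the following definition and lemma:

Definition 3.12. We define the antilinear operator $\Gamma_*$ on $\mathfrak{C}$ by

$$\Gamma_*(C) := \Gamma(C^*)$$

for all $C \in \mathfrak{C}$.

Lemma 3.13. For all $\rho \in \mathfrak{P}_+$ the operators $(\Gamma(\rho),\rho)\#$ and $\Gamma_*$ commute.

Proof. We compute

$$\Gamma_*\left((\Gamma(\rho),\rho)\#C\right) = \Gamma\left(\left(\int_0^1 \Gamma(\rho)^{1-s}C\rho^s\,ds\right)^*\right)
= \Gamma\left(\int_0^1 \rho^sC^*\Gamma(\rho)^{1-s}\,ds\right)
= \int_0^1 \Gamma(\rho)^s\,\Gamma(C^*)\,\rho^{1-s}\,ds
= (\Gamma(\rho),\rho)\#\Gamma_*(C) . \tag{35}$$
□

Proof of Theorem 3.11. Using (24) and Lemma 3.13 we obtain

$$\left[\operatorname{div}\left((\Gamma(\rho),\rho)\#\nabla C\right)\right]^* = \operatorname{div}\left(\Gamma_*\left((\Gamma(\rho),\rho)\#\nabla C\right)\right) = \operatorname{div}\left((\Gamma(\rho),\rho)\#\Gamma_*(\nabla C)\right) .$$

Since (23) implies that $\Gamma_*(\nabla C) = \nabla(C^*)$, the result follows. □

Example 3.14. Let $\rho_0 \in \mathfrak{P}_+$ be given, and define $\rho_t = \mathcal{P}_t\rho_0$. Then by Lemma 3.1,

$$\frac{d}{dt}\rho(t) = -\mathcal{N}\rho(t) = \operatorname{div}(\nabla\rho(t)) = \operatorname{div}\left((\Gamma(\rho(t)),\rho(t))\#\nabla\log\rho(t)\right) .$$

Thus, $\rho_t$ satisfies the continuity equation

$$\frac{d}{dt}\rho(t) + \operatorname{div}\left((\Gamma(\rho(t)),\rho(t))\#\mathbf{V}(t)\right) = 0 \tag{36}$$

where $\mathbf{V}(t) = -\nabla\log\rho(t)$. We shall soon see the significance of the fact that $\mathbf{V}(t)$ is a gradient.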



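We have seen so far that every continuously differentiable curve $\rho(t)$ in $\mathfrak{P}_+$ satisfies the continuity equation
for at least one time dependent vector field $\mathbf{V}(t)$. In fact, just as in the classical case, it satisfies the continuity
equation for infinitely many such time dependent vector fields: Consider any $\rho \in \mathfrak{P}_+$ and any vector field $\mathbf{W} \in \mathfrak{C}^n$
with $\operatorname{div}(\mathbf{W}) = 0$. Define

$$\widetilde{\mathbf{W}} := (\Gamma(\rho),\rho)\widehat{\#}\mathbf{W} . \tag{37}$$

Then by Theorem 3.4

$$\operatorname{div}\left((\Gamma(\rho),\rho)\#\widetilde{\mathbf{W}}\right) = \operatorname{div}(\mathbf{W}) = 0 .$$

We have proved:

Lemma 3.15. Let $\rho \in \mathfrak{P}_+$ and let $\rho(t)$ be a continuously differentiable curve in $\mathfrak{P}_+$ such that $\rho(0) = \rho$. Then, for
every $t$, the set of all vector fields $\mathbf{V} \in \mathfrak{C}^n$ for which

$$\dot\rho(t) + \operatorname{div}\left[(\Gamma(\rho(t)),\rho(t))\#\mathbf{V}\right] = 0$$

is the affine space consisting of all $\mathbf{V} \in \mathfrak{C}^n$ of the form

$$\mathbf{V} = \mathbf{V}_0 + \widetilde{\mathbf{W}} ,$$

where

$$\mathbf{V}_0 := (\Gamma(\rho(t)),\rho(t))\widehat{\#}\left[\nabla\left(\mathcal{N}^{-1}\dot\rho(t)\right)\right] ,$$

and $\widetilde{\mathbf{W}} := (\Gamma(\rho(t)),\rho(t))\widehat{\#}\mathbf{W}$ where

$$\operatorname{div}(\mathbf{W}) = 0 .$$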

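Lemma 3.16. Every $\mathbf{V} \in \mathfrak{C}^n$ has a unique decomposition into the sum of a gradient $\nabla U$ and a divergence free
vector field $\mathbf{Z}$:

$$\mathbf{V} = \nabla U + \mathbf{Z} .$$

In particular, if

$$\tau\big[\mathbf{W}^* \cdot \mathbf{V}\big] = 0$$

whenever $\operatorname{div}(\mathbf{W}) = 0$, then $\mathbf{V}$ is a gradient.

Proof. Since $\operatorname{div}(\mathbf{V})$ is orthogonal to the null space of $\mathcal{N}$, we may define $U := -\mathcal{N}^{-1}\operatorname{div}(\mathbf{V})$. Then define

$$\mathbf{Z} := \mathbf{V} - \nabla U .$$

One readily checks that $\mathbf{V} = \nabla U + \mathbf{Z}$, and $\operatorname{div}(\mathbf{Z}) = 0$.

Were the decomposition not unique, there would exist a non-zero vector field that is both a gradient and
divergence free. This is impossible since the null space of $\mathcal{N}$ is spanned by $I$. The final statement now follows
easily. □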

The next theorem identifies the "minimal" vector field $\mathbf{V}$ such that a given smooth curve $\rho(\cdot)$ in $\mathfrak{P}_+$ satisfies the
continuity equation for $\mathbf{V}$. As in the classical case, this identification is the basic step in realizing the 2-Wasserstein
distance as the distance associated to a Riemannian metric.

Theorem 3.17. Let $\rho \in \mathfrak{P}_+$ and let $\rho(t)$ be a continuously differentiable curve in $\mathfrak{P}_+$ such that $\rho(0) = \rho$. Then
among all vector fields $\mathbf{V} \in \mathfrak{C}^n$ for which

$$\dot\rho(0) + \operatorname{div}\big[(\Gamma(\rho),\rho)\#\mathbf{V}\big] = 0 , \qquad (38)$$

there is exactly one that is a gradient; i.e., has the form $\mathbf{V} = \nabla U$ for $U \in \mathfrak{C}$. Moreover, there exists a self-adjoint
element $S \in \mathfrak{C}$ such that $\nabla U = \nabla S$, and we have

$$\tau\big[(\nabla U)^* \cdot (\Gamma(\rho),\rho)\#\nabla U\big] < \tau\big[\mathbf{V}^* \cdot (\Gamma(\rho),\rho)\#\mathbf{V}\big]$$

for all other $\mathbf{V} \in \mathfrak{C}^n$ satisfying (38).



Proof. By what we proved in the last subsection, $\mathbf{V} \mapsto \sqrt{\tau[\mathbf{V}^* \cdot (\Gamma(\rho),\rho)\#\mathbf{V}]}$ is a Hilbertian norm on $\mathfrak{C}^n$. By
the Projection Lemma, there is a unique element in the closed convex, in fact, affine, set

$$\mathcal{V} := \big\{ \mathbf{V} \in \mathfrak{C}^n : \dot\rho(0) + \operatorname{div}\big[(\Gamma(\rho),\rho)\#\mathbf{V}\big] = 0 \big\}$$

of minimal norm. Note that $\mathcal{V}$ is non-empty by Lemma 3.15. Let $\mathbf{V}_*$ denote the minimizer. Then by the previous
lemma, for each $t \in \mathbb{R}\setminus\{0\}$, and each nonzero $\widetilde{\mathbf{W}}$ such that $\operatorname{div}(\widetilde{\mathbf{W}}) = 0$,

$$\tau\big[(\mathbf{V}_*)^* \cdot (\Gamma(\rho),\rho)\#\mathbf{V}_*\big] < \tau\big[(\mathbf{V}_* + t\mathbf{W})^* \cdot (\Gamma(\rho),\rho)\#(\mathbf{V}_* + t\mathbf{W})\big] ,$$

where $\mathbf{W}$ is defined by (37). Expanding to first order in $t$, and applying Theorem RA, we conclude

$$\operatorname{Re}\big(\tau[\widetilde{\mathbf{W}}^* \cdot \mathbf{V}_*]\big) = 0$$

whenever $\operatorname{div}(\widetilde{\mathbf{W}}) = 0$. Replacing $\widetilde{\mathbf{W}}$ by $i\widetilde{\mathbf{W}}$, we obtain the same conclusion for the imaginary part. By Lemma 3.16,
this means that $\mathbf{V}_* = \nabla U$ for some $U \in \mathfrak{C}$.

The proof we have just given shows that in fact any gradient vector field in our affine set $\mathcal{V}$ would be a critical
point of the squared norm. But by the strict convexity of the squared norm, there can be only one critical point.
Hence $\nabla U$ is the unique gradient in $\mathcal{V}$.

It remains to show that there exists a self-adjoint element $S \in \mathfrak{C}$ such that $\nabla U = \nabla S$. For this purpose, we
define $S = \tfrac12(U + U^*)$ and $A = \tfrac12(U - U^*)$. It then suffices to show that $\nabla A = 0$. To simplify notation, set
$T(C) := \operatorname{div}[(\Gamma(\rho),\rho)\#\nabla C]$. Using Theorem 3.11 and the fact that $T(U) = -\dot\rho(0)$ is self-adjoint, we infer that

$$T(S) + T(A) = T(U) = T(U^*) = T(S^* + A^*) = T(S) - T(A) ,$$

hence $\operatorname{div}[(\Gamma(\rho),\rho)\#\nabla A] = T(A) = 0$. Since we just proved that $\nabla U$ is the unique minimizer in $\mathcal{V}$, we infer that
$\nabla A = 0$. □



3.3 The Riemannian metric 



Theorems 3.11 and 3.17 allow us to identify the tangent space of $\mathfrak{P}_+$ with the $2^n - 1$ dimensional real vector
space consisting of all vector fields in $\mathfrak{C}^n$ which are gradients of self-adjoint elements in $\mathfrak{C}$: If $\rho(t)$ is a continuously
differentiable curve in $\mathfrak{P}_+$ with $\rho(0) = \rho \in \mathfrak{P}_+$, we identify the corresponding tangent vector with $\nabla U$, where $\nabla U$
is the unique gradient such that (38) is satisfied. We are ready for the central definition:

Definition 3.18. Let $\rho \in \mathfrak{P}_+$, and let $T_\rho$ denote the tangent space to $\mathfrak{P}_+$ at $\rho$. The positive definite quadratic
form $g_\rho$ on $T_\rho$ is defined by

$$g_\rho(\dot\rho(0), \dot\rho(0)) := \tau\big[(\nabla U)^* \cdot (\Gamma(\rho),\rho)\#\nabla U\big]$$

where $\nabla U$ is the unique gradient such that (38) is satisfied.

By what we have explained above, this is in fact a Riemannian metric, and indeed is smooth on the manifold
$\mathfrak{P}_+$. Let $F$ be a smooth real valued function on $\mathfrak{P}_+$. Then the gradient of $F$, denoted $\operatorname{grad}_\rho(F)$, is the unique
vector field on $\mathfrak{P}_+$ such that whenever $\rho(t)$ is a smooth curve in $\mathfrak{P}_+$ with $\rho(0) = \rho$,

$$\frac{d}{dt}F(\rho(t))\Big|_{t=0} = g_\rho\big(\operatorname{grad}_\rho(F), \dot\rho(0)\big) .$$



In particular, suppose that $f$ is a real valued, continuously differentiable function on $(0,\infty)$, and $F$ is given by

$$F(\rho) = \tau[f(\rho)] .$$

Then by the Spectral Theorem,

$$\frac{d}{dt}F(\rho(t))\Big|_{t=0} = \tau\big[f'(\rho)\,\dot\rho(0)\big] .$$

Writing

$$\dot\rho(0) + \operatorname{div}\big((\Gamma(\rho),\rho)\#\nabla U\big) = 0$$

and integrating by parts, this becomes

$$\frac{d}{dt}F(\rho(t))\Big|_{t=0} = \tau\big[(\nabla f'(\rho))^* \cdot (\Gamma(\rho),\rho)\#\nabla U\big] = g_\rho\big(\nabla f'(\rho), \nabla U\big) . \qquad (39)$$

This computation shows that for a function $F$ on $\mathfrak{P}_+$ of the form $F(\rho) = \tau[f(\rho)]$,

$$\operatorname{grad}_\rho F = \nabla f'(\rho) . \qquad (40)$$

Definition 3.19. Given a function $F$ on $\mathfrak{P}_+$ of the form $F(\rho) = \tau[f(\rho)]$ where $f$ is smooth on $(0,\infty)$, the gradient
flow equation for $F$ on $\mathfrak{P}_+$ is the evolution equation

$$\frac{d}{dt}\rho(t) + \operatorname{div}\big[(\Gamma(\rho(t)),\rho(t))\#\big(-\operatorname{grad}_{\rho(t)}F\big)\big] = 0 ,$$

which by (40) is equivalent to

$$\frac{d}{dt}\rho(t) = \operatorname{div}\big[(\Gamma(\rho(t)),\rho(t))\#\big(\nabla f'(\rho(t))\big)\big] . \qquad (41)$$

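We have now completed the work required to prove our first main result:

Theorem 3.20. The flow given by the Fermionic Mehler semigroup is the same as the gradient flow

$$\frac{d}{dt}\rho(t) + \operatorname{div}\big[(\Gamma(\rho(t)),\rho(t))\#\big(-\operatorname{grad}_{\rho(t)}S\big)\big] = 0 ,$$

where $S(\rho)$ is the relative entropy function $\tau[\rho\log\rho]$.

Proof. Note that $S(\rho) = \tau[f(\rho)]$ where $f(r) = r\log r$. Since $f'(\rho) = I + \log\rho$, we have

$$\operatorname{div}\big[(\Gamma(\rho),\rho)\#(\operatorname{grad}_\rho S)\big] = \operatorname{div}\big[(\Gamma(\rho),\rho)\#(\nabla\log\rho)\big] .$$

Comparison with (36) concludes the proof. □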

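This shows once more that if $\rho \in \mathfrak{P}$, and $\rho(t) := \mathcal{P}_t\rho$, then $S(\rho(t))$ is a strictly decreasing function of $t$ with
$\lim_{t\to\infty} S(\rho(t)) = 0$. In fact, one can say more: Reversing the steps in the basic computation that led us to the
definition of the Riemannian metric, we have

$$\begin{aligned}
g_{\rho(t)}(\dot\rho(t),\dot\rho(t)) &= \tau\big[(\nabla\log\rho(t))^* \cdot (\Gamma(\rho(t)),\rho(t))\#\nabla\log\rho(t)\big] \\
&= \tau\big[(\nabla\log\rho(t))^* \cdot \nabla\rho(t)\big] \\
&= -\frac{d}{dt}S(\rho(t)) . \qquad (42)
\end{aligned}$$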

The next lemma quantifies the rate of dissipation of entropy: 

Lemma 3.21 (Exponential entropy dissipation). Let $\rho(t)$ be any solution of the Fermionic Fokker-Planck equation.
Then

$$S(\rho(t)) \le e^{-2t}S(\rho(0)) . \qquad (43)$$

Proof. This is a direct consequence of Gronwall's inequality, (42), and the modified Fermionic Logarithmic Sobolev
Inequality

$$S(\rho) \le \frac12\,\tau\big[(\nabla\rho)^* \cdot \nabla\log\rho\big] , \qquad (44)$$

for which a simple direct proof is provided in [g[ . □
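
The Gronwall step in this proof can be written out in a few lines. The following is a sketch under the assumption that (43) reads $S(\rho(t)) \le e^{-2t}S(\rho(0))$ and that (44) reads $S(\rho) \le \tfrac12\tau[(\nabla\rho)^*\cdot\nabla\log\rho]$; it also uses that the trace appearing in (42) is real, so that it coincides with the trace appearing in (44).

```latex
\begin{align*}
\frac{d}{dt} S(\rho(t))
  &= -\tau\big[(\nabla\log\rho(t))^* \cdot \nabla\rho(t)\big]
     && \text{by (42)} \\
  &\le -2\, S(\rho(t))
     && \text{by (44)}, \\
\frac{d}{dt}\Big( e^{2t}\, S(\rho(t)) \Big) &\le 0
  \quad\Longrightarrow\quad
  S(\rho(t)) \le e^{-2t}\, S(\rho(0)) .
\end{align*}
```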

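Remark 3.22. It is worth noting here that (44) can be deduced from the (unmodified) Fermionic Logarithmic
Sobolev Inequality

$$S(\rho) \le 2\,\mathcal{E}(\rho^{1/2},\rho^{1/2}) \qquad (45)$$

that was proved in M. To see that (45) implies (44), we recall a basic inequality of Gross (see Lemma 1.1 of p7|),
which says that for all $\rho \in \mathfrak{P}$, and all $1 < p < \infty$,

$$\tau\big[(\nabla\rho^{p/2})^* \cdot \nabla\rho^{p/2}\big] \le \frac{(p/2)^2}{p-1}\,\tau\big[(\nabla\rho)^* \cdot \nabla\rho^{p-1}\big] . \qquad (46)$$

Taking the limit $p \to 1$, one obtains the corollary:

$$\mathcal{E}(\rho^{1/2},\rho^{1/2}) = \tau\big[(\nabla\rho^{1/2})^* \cdot \nabla\rho^{1/2}\big] \le \frac14\,\tau\big[(\nabla\log\rho)^* \cdot \nabla\rho\big] . \qquad (47)$$

Combining this with (45) we obtain (44).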
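
The limit $p \to 1$ in this remark can be made explicit. The lines below are a sketch that assumes the constant in (46) is $(p/2)^2/(p-1)$ and that $\rho$ is strictly positive, so that $\log\rho$ is defined; the only other ingredient is $\nabla I = 0$.

```latex
\begin{align*}
\nabla\rho^{\,p-1} &= \nabla\big(\rho^{\,p-1} - I\big),
\qquad
\frac{\rho^{\,p-1} - I}{p-1} \;\xrightarrow[\;p\to 1\;]{}\; \log\rho , \\
\frac{(p/2)^2}{p-1}\,\tau\big[(\nabla\rho)^* \cdot \nabla\rho^{\,p-1}\big]
 &= \Big(\frac{p}{2}\Big)^{2}\,
    \tau\Big[(\nabla\rho)^* \cdot \nabla\,\frac{\rho^{\,p-1} - I}{p-1}\Big]
 \;\xrightarrow[\;p\to 1\;]{}\;
 \frac14\,\tau\big[(\nabla\rho)^* \cdot \nabla\log\rho\big] .
\end{align*}
```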
4 A Talagrand inequality and the diameter of $\mathfrak{P}$

4.1 Arclength, entropy and a Talagrand inequality

We begin our study of properties of the Riemannian manifold $\mathfrak{P}_+$ equipped with the metric $g_\rho$ defined in the
previous section.

Definition 4.1. Let $t \mapsto \rho(t)$ be a continuously differentiable curve in $\mathfrak{P}_+$ defined on $(a,b)$ where $-\infty \le a < b \le +\infty$.
Then the arclength of the curve $\rho(\cdot)$, $\operatorname{arclength}[\rho(\cdot)]$, is given by

$$\operatorname{arclength}[\rho(\cdot)] := \int_a^b \sqrt{g_{\rho(t)}(\dot\rho(t),\dot\rho(t))}\,dt .$$

Of course, the arc length is independent of the smooth parameterization, and it is always possible to smoothly
reparameterize so that $a = 0$ and $b = 1$. As usual, this is taken advantage of in the next (standard) definition:

Definition 4.2. For $\rho_0, \rho_1 \in \mathfrak{P}_+$, the set $\mathcal{C}(\rho_0,\rho_1)$ of all couplings of $\rho_0$ and $\rho_1$ is the set of all maps $t \mapsto \rho(t)$ from
$[0,1]$ to $\mathfrak{P}_+$ that are smooth on $(0,1)$, continuous on $[0,1]$ and satisfy $\rho(0) = \rho_0$ and $\rho(1) = \rho_1$. The Riemannian
distance between $\rho_0$ and $\rho_1$ is the quantity

$$d(\rho_0,\rho_1) = \inf\big\{ \operatorname{arclength}[\rho(\cdot)] : \rho(\cdot) \in \mathcal{C}(\rho_0,\rho_1) \big\} . \qquad (48)$$

In what follows, when we refer to the Riemannian distance on $\mathfrak{P}_+$, we always mean the distance defined in (48).

Writing things out more explicitly, for any two $\rho_0, \rho_1 \in \mathfrak{P}_+$,

$$d(\rho_0,\rho_1) = \inf\Big\{ \int_0^1 \sqrt{g_{\rho(t)}(\dot\rho(t),\dot\rho(t))}\,dt : \rho(\cdot) \in \mathcal{C}(\rho_0,\rho_1) \Big\} .$$

Yet somewhat more explicitly,

$$d(\rho_0,\rho_1) = \inf\Big\{ \int_0^1 \|\nabla U(t)\|_{\rho(t)}\,dt : \rho(\cdot) \in \mathcal{C}(\rho_0,\rho_1) ,\ \dot\rho(t) + \operatorname{div}\big((\Gamma(\rho(t)),\rho(t))\#\nabla U(t)\big) = 0 \Big\} ,$$

where

$$\|\nabla U(t)\|_{\rho(t)} := \sqrt{\tau\big[(\nabla U(t))^* \cdot (\Gamma(\rho(t)),\rho(t))\#\nabla U(t)\big]} .$$

This is a direct analog of the Brenier-Benamou formula for the 2-Wasserstein distance Q , which in turn follows
from Otto's Riemannian interpretation of the 2-Wasserstein distance p6[ .

Our first goal is to bound the diameter of $\mathfrak{P}_+$ in the Riemannian metric. We do this using a Fermionic
analog of Talagrand's Gaussian transportation inequality p3]. The direct connection between logarithmic Sobolev
inequalities and Talagrand inequalities was discovered by Otto and Villani pq] . Our argument in the present setting
uses their ideas, but is also somewhat different.

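Theorem 4.3 (Talagrand type inequality). For all $\rho \in \mathfrak{P}_+$,

$$d(\rho, I) \le \sqrt{2S(\rho)} . \qquad (49)$$

Proof. Given $\rho \in \mathfrak{P}_+$, define $\rho(t) = \mathcal{P}_t\rho$ for $t \in (0,\infty)$. Since $\lim_{t\to\infty}\rho(t) = I$, it follows that

$$d(\rho, I) \le \operatorname{arclength}[\rho(\cdot)] = \int_0^\infty \sqrt{g_{\rho(t)}(\dot\rho(t),\dot\rho(t))}\,dt .$$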



CM January 28, 2013 19 

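By (42), $g_{\rho(t)}(\dot\rho(t),\dot\rho(t)) = -\frac{d}{dt}S(\rho(t))$ so that for any $0 \le t_1 < t_2 < \infty$, the Cauchy-Schwarz inequality gives

$$\int_{t_1}^{t_2} \sqrt{g_{\rho(t)}(\dot\rho(t),\dot\rho(t))}\,dt \le \sqrt{t_2 - t_1}\,\sqrt{S(\rho(t_1)) - S(\rho(t_2))} . \qquad (50)$$

Fix any $\varepsilon > 0$. Define the sequence of times $\{t_k\}$, $k \in \mathbb{N}$, by

$$S(\rho(t_k)) = e^{-k\varepsilon}S(\rho) .$$

(Since $S(\rho(t))$ is strictly decreasing, $t_k$ is well defined.) By Lemma 3.21, for each $k$,

$$t_k - t_{k-1} \le \frac{\varepsilon}{2} .$$

Then by (50), with this choice of $\{t_k\}$,

$$\int_{t_{k-1}}^{t_k} \sqrt{g_{\rho(t)}(\dot\rho(t),\dot\rho(t))}\,dt \le \sqrt{\frac{\varepsilon}{2}\big(e^{-(k-1)\varepsilon} - e^{-k\varepsilon}\big)S(\rho)} .$$

Since

$$\lim_{\varepsilon\to 0}\sum_{k=1}^\infty e^{-k\varepsilon/2}\sqrt{\varepsilon(e^{\varepsilon} - 1)} = \lim_{\varepsilon\to 0}\sum_{k=1}^\infty e^{-k\varepsilon/2}\,\varepsilon = \int_0^\infty e^{-x/2}\,dx = 2 ,$$

we obtain the desired bound. □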
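
The limit used in the last step of this proof is easy to probe numerically. The following is a minimal sketch, assuming the sum in question is $\sum_{k\ge 1} e^{-k\varepsilon/2}\sqrt{\varepsilon(e^{\varepsilon}-1)}$; the function name `talagrand_sum` and the truncation tolerance are illustrative choices, not part of the text.

```python
import math


def talagrand_sum(eps: float, tol: float = 1e-16) -> float:
    """Approximate sum_{k>=1} exp(-k*eps/2) * sqrt(eps * (exp(eps) - 1)).

    The geometric tail is truncated once a single term drops below `tol`.
    """
    weight = math.sqrt(eps * math.expm1(eps))  # common factor sqrt(eps (e^eps - 1))
    total = 0.0
    k = 1
    while True:
        term = math.exp(-k * eps / 2.0)
        if term < tol:
            break
        total += term
        k += 1
    return weight * total


if __name__ == "__main__":
    # The values should approach 2 as eps decreases.
    for eps in (1.0, 0.1, 0.01):
        print(eps, talagrand_sum(eps))
```

Multiplying the limiting value 2 by the prefactor $\sqrt{S(\rho)/2}$ that comes from the per-interval bound gives $\sqrt{2S(\rho)}$, which is the right-hand side of (49).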

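4.2 The diameter of $\mathfrak{P}_+$

Since

$$\sup\{ S(\rho) : \rho \in \mathfrak{P}_+ \} = \frac{n}{2}\log 2 ,$$

we have proved:

Lemma 4.4.

$$\operatorname{diam}(\mathfrak{P}_+) \le 2\sqrt{n\log 2} . \qquad (51)$$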

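There are other ways to bound the diameter. Given $\rho \in \mathfrak{P}$, define $\rho(t) = (1-t)\rho + tI$. Then $\rho(\cdot) \in \mathcal{C}(\rho, I)$, and
$\dot\rho(t) = I - \rho$ for all $t$. As we have seen, $\rho(t)$ satisfies the continuity equation

$$\frac{d}{dt}\rho(t) + \operatorname{div}\big[(\Gamma(\rho(t)),\rho(t))\#\mathbf{V}(t)\big] = 0 ,$$

where

$$\mathbf{V}(t) = \big((\Gamma(\rho(t)),\rho(t))\#\big)^{-1}\nabla\big(\mathcal{N}^{-1}(I - \rho)\big) .$$

By the variational characterization of the tangent vector given in Theorem 3.17,

$$\begin{aligned}
g_{\rho(t)}(\dot\rho(t),\dot\rho(t)) &\le \langle \mathbf{V}(t), \mathbf{V}(t)\rangle_{\rho(t)} \\
&= \tau\big[(\nabla\mathcal{N}^{-1}(I-\rho))^* \cdot \big((\Gamma(\rho(t)),\rho(t))\#\big)^{-1}\nabla\mathcal{N}^{-1}(I-\rho)\big] .
\end{aligned}$$

Since $\rho(t) \ge tI$, Lemma 3.6 implies that the right-hand side can be bounded from above by

$$\frac{1}{t}\,\tau\big[(\nabla\mathcal{N}^{-1}(I-\rho))^* \cdot \nabla\mathcal{N}^{-1}(I-\rho)\big] = \frac{1}{t}\,\tau\big[(I-\rho)\,\mathcal{N}^{-1}(I-\rho)\big] .$$

Thus we have the bound

$$d(\rho, I) \le \|I - \rho\|_{L^2(\tau)}\int_0^1 \frac{1}{\sqrt{t}}\,dt = 2\,\|I - \rho\|_{L^2(\tau)} .$$

This, however, is a cruder bound than the one we obtained using the entropy.

4.3 Extension of the metric to $\mathfrak{P}$

Our next aim is to show that the distance function $d$ defined on $\mathfrak{P}_+$ can be continuously extended to $\mathfrak{P}$. We shall
see however in Section 6 that, even in dimension 1, the Riemannian metric $g_\rho$ does not extend continuously to the
boundary of $\mathfrak{P}$.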

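Proposition 4.5. Let $\rho_0, \rho_1 \in \mathfrak{P}$ and let $\{\rho_0^n\}_n$, $\{\rho_1^n\}_n$ be sequences in $\mathfrak{P}_+$ satisfying

$$\tau\big[|\rho_0^n - \rho_0|^2\big] \to 0 , \qquad \tau\big[|\rho_1^n - \rho_1|^2\big] \to 0 , \qquad (52)$$

as $n \to \infty$. Then the sequence $\{d(\rho_0^n, \rho_1^n)\}_n$ is Cauchy.

Proof. By the triangle inequality, it suffices to show that $d(\rho_0^n, \rho_0^m) \to 0$ as $n, m \to \infty$.

For this purpose, we fix $\varepsilon \in (0,1)$, set $\bar\rho := (1-\varepsilon)\rho_0 + \varepsilon I$, and take $N \ge 1$ so large that $\tau[|\rho_0^n - \rho_0|^2] \le \varepsilon^2$
whenever $n \ge N$. Fix $n \ge N$ and consider the linear interpolation $\rho(t) = (1-t)\rho_0^n + t\bar\rho$. Since $\rho(t) \ge t\varepsilon I$ for
$t \in [0,1]$, it follows from the definition of $d$ and Lemma 3.6 that

$$\begin{aligned}
d(\rho_0^n, \bar\rho) &\le \int_0^1 \sqrt{\tau\big[(\nabla\mathcal{N}^{-1}\dot\rho(t))^* \cdot \big((\Gamma(\rho(t)),\rho(t))\#\big)^{-1}\nabla\mathcal{N}^{-1}\dot\rho(t)\big]}\,dt \\
&\le \int_0^1 \sqrt{\frac{\tau[|\dot\rho(t)|^2]}{t\varepsilon}}\,dt = 2\sqrt{\frac{\tau[|\dot\rho(t)|^2]}{\varepsilon}} .
\end{aligned}$$

Since

$$\begin{aligned}
\tau\big[|\dot\rho(t)|^2\big] &= \tau\big[|\rho_0 - \rho_0^n + \varepsilon(I - \rho_0)|^2\big] \\
&\le 2\,\tau\big[|\rho_0 - \rho_0^n|^2\big] + 2\varepsilon^2\,\tau\big[|I - \rho_0|^2\big] \\
&\le 2\big(1 + \tau[|I - \rho_0|^2]\big)\varepsilon^2 ,
\end{aligned}$$

we infer that $d(\rho_0^n, \bar\rho) \le C\sqrt{\varepsilon}$ for some $C$ depending only on $\rho_0$. It follows that $d(\rho_0^n, \rho_0^m) \le 2C\sqrt{\varepsilon}$ for $n, m \ge N$,
which completes the proof. □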

In view of this result, the following definition makes sense: 
Definition 4.6. For $\rho_0, \rho_1 \in \mathfrak{P}$ we define

$$d(\rho_0, \rho_1) := \lim_{n\to\infty} d(\rho_0^n, \rho_1^n) ,$$

where $\{\rho_0^n\}_n$, $\{\rho_1^n\}_n$ are arbitrary sequences in $\mathfrak{P}_+$ satisfying (52).

Clearly, for $\rho_0, \rho_1 \in \mathfrak{P}_+$, this definition is consistent with the one given before. Note also that $d(\rho_0, \rho_1)$ is finite,
since $\mathfrak{P}_+$ has finite diameter by (51).

We have now proved, in view of Lemma 4.4:

Theorem 4.7.

$$\operatorname{diam}(\mathfrak{P}) \le 2\sqrt{n\log 2} . \qquad (53)$$

5 Characterization of geodesics and geodesic convexity of the entropy

5.1 Geodesic equations

Our next aim is to characterize the geodesics in the Riemannian manifold $\mathfrak{P}_+$: A (constant speed) geodesic is a
curve $u : [0,1] \to \mathfrak{P}$ satisfying

$$d(u(s), u(t)) = |t - s|\,d(u(0), u(1))$$

for all $s, t \in [0,1]$. Such curves must satisfy an Euler-Lagrange equation that we shall now derive for our Riemannian
metric. In order to make the argument more transparent, we make a brief detour to a more abstract setting. See
(55) below for the interpretation of the terms in our Clifford algebra setting.

Let $(V, \langle\cdot,\cdot\rangle)$ be a finite-dimensional real Hilbert space. Let $W \subset V$ be a linear subspace, fix $z \in V \setminus W$, consider
the affine subspace $W_z := z + W$, and let $M \subset W_z$ be a relatively open subset. Let $D : M \to \mathcal{L}(W)$ be a smooth
function such that $D(x)$ is self-adjoint and invertible for all $x \in M$. We shall write $C(x) := D(x)^{-1}$. Consider the
Lagrangian $L : W \times M \to \mathbb{R}$ defined by $L(p, x) = \langle C(x)p, p\rangle$ and the associated minimization problem

$$\inf_{u(\cdot) \in C^1([0,1], M)} \Big\{ \int_0^1 L(u'(t), u(t))\,dt : u(0) = u_0 ,\ u(1) = u_1 \Big\} ,$$

where $u_0, u_1 \in M$ are given boundary values.

Then the Euler-Lagrange equation $\frac{d}{dt}L_p(u', u) - L_x(u', u) = 0$ takes the form

$$\frac{d}{dt}\big(C(u(t))u'(t)\big) - \frac12\big\langle \partial_x C(u(t))u'(t), u'(t)\big\rangle = 0 .$$

Using the identity $\partial_x C(x) = -C(x)\,\partial_x D(x)\,C(x)$ and the substitution $v(t) := C(u(t))u'(t)$ we infer that the Euler-Lagrange
equations are equivalent to the system

$$\begin{cases}
u'(t) - D(u(t))v(t) = 0 , \\
v'(t) + \frac12\big\langle \partial_x D(u(t))v(t), v(t)\big\rangle = 0 .
\end{cases} \qquad (54)$$

We shall apply this result to the case where

$$\begin{aligned}
V &= \{ A \in \mathfrak{C} : A \text{ self-adjoint} \} , & \langle\cdot,\cdot\rangle &= \langle\cdot,\cdot\rangle_{L^2(\tau)} , \\
W &= \mathfrak{C}_0 := \{ A \in \mathfrak{C} : A \text{ self-adjoint},\ \tau(A) = 0 \} , & M &= \mathfrak{P}_+ ,
\end{aligned} \qquad (55)$$

and for any $\rho \in \mathfrak{P}_+$ the operator $D(\rho) : \mathfrak{C}_0 \to \mathfrak{C}_0$ is given by

$$D(\rho) : U \mapsto -\operatorname{div}\big[(\Gamma(\rho),\rho)\#\nabla U\big] .$$
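
The abstract system (54) can be explored numerically in the simplest scalar case $W = \mathbb{R}$. The sketch below is illustrative only: it takes the coefficient $D(x) = x/\operatorname{arctanh}(x)$ on $M = (-1,1)$, which is the coefficient that arises in the one-dimensional example of Section 6.1, integrates $u' = D(u)v$, $v' = -\tfrac12 D'(u)v^2$ with a classical Runge-Kutta step, and monitors the Lagrangian $\langle C(u)u', u'\rangle = D(u)v^2$, which is constant along exact solutions. The function names, step count and initial data are hypothetical.

```python
import math


def D(x: float) -> float:
    """Coefficient D(x) = x / arctanh(x), extended continuously by D(0) = 1."""
    return 1.0 if abs(x) < 1e-12 else x / math.atanh(x)


def dD(x: float) -> float:
    """Derivative of D; near 0 the leading Taylor term -2x/3 is used."""
    if abs(x) < 1e-6:
        return -2.0 * x / 3.0
    a = math.atanh(x)
    return (a - x / (1.0 - x * x)) / (a * a)


def rhs(u: float, v: float):
    """Right-hand side of the system u' = D(u) v, v' = -(1/2) D'(u) v^2."""
    return D(u) * v, -0.5 * dD(u) * v * v


def rk4(u: float, v: float, steps: int = 1000, t_end: float = 1.0):
    """Integrate the system from t = 0 to t = t_end with classical RK4."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(u, v)
        k2 = rhs(u + 0.5 * h * k1[0], v + 0.5 * h * k1[1])
        k3 = rhs(u + 0.5 * h * k2[0], v + 0.5 * h * k2[1])
        k4 = rhs(u + h * k3[0], v + h * k3[1])
        u += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        v += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return u, v


def energy(u: float, v: float) -> float:
    """Value of the Lagrangian <C(u)u', u'> = D(u) v^2 along a solution."""
    return D(u) * v * v


if __name__ == "__main__":
    u0, v0 = -0.5, 0.8
    u1, v1 = rk4(u0, v0)
    print("endpoint:", u1, "energy drift:", energy(u1, v1) - energy(u0, v0))
```

Conservation of $D(u)v^2$ is the scalar counterpart of the constant-speed property of geodesics.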



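Note that $D(\rho)$ is invertible for any $\rho \in \mathfrak{P}_+$, as follows from Theorem 3.17 and the fact that the null space of $\nabla$
consists of multiples of the identity operator. Furthermore, using Lemma 3^ we infer that $\langle U, D(\rho)V\rangle_{L^2(\tau)} \in \mathbb{R}$ for
all $U, V \in \mathfrak{C}_0$, and

$$\langle U, D(\rho)V\rangle_{L^2(\tau)} = \langle \nabla U, \nabla V\rangle_\rho = \langle \nabla V, \nabla U\rangle_\rho = \langle V, D(\rho)U\rangle_{L^2(\tau)} ,$$

hence $D(\rho)$ satisfies the assumptions above. In order to apply (54) we use the more general chain rule provided in
the Appendix in Propositions A.1 and | to compute

$$\frac{d}{dt}\Big|_{t=0}(\rho + t\sigma)^{\alpha} = \int_0^1\!\!\int_0^{\alpha} \frac{\rho^{\alpha-\beta}}{(1-s)I + s\rho}\,\sigma\,\frac{\rho^{\beta}}{(1-s)I + s\rho}\,d\beta\,ds$$

for any $0 < \alpha < 1$, $\rho \in \mathfrak{P}_+$, and $\sigma \in \mathfrak{C}_0$. Consequently, for $U \in \mathfrak{C}_0$,

$$\begin{aligned}
\frac{d}{dt}\Big|_{t=0}\langle D(\rho + t\sigma)U, U\rangle_{L^2(\tau)}
&= \frac{d}{dt}\Big|_{t=0}\int_0^1 \tau\big[(\nabla U)^* \cdot \Gamma(\rho + t\sigma)^{1-\alpha} \cdot \nabla U \cdot (\rho + t\sigma)^{\alpha}\big]\,d\alpha \\
&= \int_0^1 \tau\Big[(\nabla U)^* \cdot \frac{d}{dt}\Big|_{t=0}\Gamma(\rho + t\sigma)^{\alpha} \cdot \nabla U \cdot \rho^{1-\alpha}
 + (\nabla U)^* \cdot \Gamma(\rho)^{1-\alpha} \cdot \nabla U \cdot \frac{d}{dt}\Big|_{t=0}(\rho + t\sigma)^{\alpha}\Big]\,d\alpha \\
&= \int_0^1\!\!\int_0^1\!\!\int_0^{\alpha} \tau\Big[(\nabla U)^* \cdot \frac{\Gamma(\rho)^{\alpha-\beta}}{(1-s)I + s\Gamma(\rho)}\,\Gamma(\sigma)\,\frac{\Gamma(\rho)^{\beta}}{(1-s)I + s\Gamma(\rho)} \cdot \nabla U \cdot \rho^{1-\alpha} \\
&\qquad\qquad + (\nabla U)^* \cdot \Gamma(\rho)^{1-\alpha} \cdot \nabla U \cdot \frac{\rho^{\alpha-\beta}}{(1-s)I + s\rho}\,\sigma\,\frac{\rho^{\beta}}{(1-s)I + s\rho}\Big]\,d\beta\,d\alpha\,ds .
\end{aligned}$$

Using cyclicity of the trace and the identities (EG) - (E3[) we obtain

$$\begin{aligned}
\frac{d}{dt}\Big|_{t=0}\langle D(\rho + t\sigma)U, U\rangle_{L^2(\tau)}
&= \tau\Big[\sigma \int_0^1\!\!\int_0^1\!\!\int_0^{\alpha} \frac{\rho^{\beta}}{(1-s)I + s\rho}\,\Gamma(\nabla U) \cdot \Gamma(\rho)^{1-\alpha} \cdot \Gamma_*(\nabla U)\,\frac{\rho^{\alpha-\beta}}{(1-s)I + s\rho}\,d\beta\,d\alpha\,ds\Big] \\
&\quad + \tau\Big[\sigma \int_0^1\!\!\int_0^1\!\!\int_0^{\alpha} \frac{\rho^{\beta}}{(1-s)I + s\rho}\,(\nabla U)^* \cdot \Gamma(\rho)^{1-\alpha} \cdot \nabla U\,\frac{\rho^{\alpha-\beta}}{(1-s)I + s\rho}\,d\beta\,d\alpha\,ds\Big] \\
&= 2\,\tau\Big[\sigma \int_0^1\!\!\int_0^1\!\!\int_0^{\alpha} \frac{\rho^{\beta}}{(1-s)I + s\rho}\,(\nabla U)^* \cdot \Gamma(\rho)^{1-\alpha} \cdot \nabla U\,\frac{\rho^{\alpha-\beta}}{(1-s)I + s\rho}\,d\beta\,d\alpha\,ds\Big] .
\end{aligned}$$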

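Therefore the following definition is natural.

Definition 5.1. For $\rho \in \mathfrak{P}_+$ and $\mathbf{V}_1, \mathbf{V}_2 \in \mathfrak{C}^n$ we set

$$\rho\flat(\mathbf{V}_1, \mathbf{V}_2) = 2\int_0^1\!\!\int_0^1\!\!\int_0^{\alpha} \frac{\rho^{\beta}}{(1-s)I + s\rho}\;\mathbf{V}_1^* \cdot \Gamma(\rho)^{1-\alpha} \cdot \mathbf{V}_2\;\frac{\rho^{\alpha-\beta}}{(1-s)I + s\rho}\,d\beta\,d\alpha\,ds .$$

Remark 5.2. If $\rho$, $\Gamma(\rho)$, $\mathbf{V}_1$ and $\mathbf{V}_2$ all commute, it is easy to explicitly compute the integrals and one finds

$$\rho\flat(\mathbf{V}_1, \mathbf{V}_2) = \mathbf{V}_1^* \cdot \mathbf{V}_2$$

in this case.

With this notation the identity above can be rewritten as

$$\frac{d}{dt}\Big|_{t=0}\langle D(\rho + t\sigma)U, U\rangle_{L^2(\tau)} = \big\langle \sigma, \rho\flat(\nabla U, \nabla U)\big\rangle_{L^2(\tau)} ,$$

and in view of (54) we have proved the following result:

Theorem 5.3. The geodesic equations in the Riemannian manifold $\mathfrak{P}_+$ are given by

$$\begin{cases}
\dot\rho(t) + \operatorname{div}\big[(\Gamma(\rho(t)),\rho(t))\#\nabla U(t)\big] = 0 , \\
\dot U(t) + \frac12\,\rho(t)\flat\big(\nabla U(t), \nabla U(t)\big) = 0 .
\end{cases} \qquad (56)$$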



Remark 5.4. These equations should be compared with the geodesic equations in the Wasserstein space over $\mathbb{R}^n$,
which are given by

$$\begin{cases}
\partial_t\rho + \nabla\cdot(\rho\nabla U) = 0 , \\
\partial_t U + \frac12|\nabla U|^2 = 0 .
\end{cases} \qquad (57)$$

The Fermionic analogue is similar, but note that the second 'Hamilton-Jacobi-like' equation in (56) depends on
$\rho$. However, as explained in Remark 5.2, this dependence is trivial in the presence of sufficient commutativity, in
which case (56) reduces to an exact analog of (57).

5.2 The Hessian of the entropy 

Now we are ready to compute the Hessian of the entropy. 
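Proposition 5.5. For $\rho \in \mathfrak{P}_+$ and $U \in \mathfrak{C}_0$ we have

$$\operatorname{Hess}_\rho S(\nabla U, \nabla U) = \big\langle (\Gamma(\rho),\rho)\#\nabla U, \nabla\mathcal{N}U\big\rangle_{L^2(\tau)} - \frac12\big\langle \mathcal{N}\rho, \rho\flat(\nabla U, \nabla U)\big\rangle_{L^2(\tau)} . \qquad (58)$$

Proof. Let $\rho(t) \in \mathfrak{P}_+$ and $U(t) \in \mathfrak{C}_0$ satisfy the geodesic equations (56). We shall suppress the variable $t$ in order
to improve readability. Using Lemma pj we obtain

$$\begin{aligned}
\frac{d}{dt}S(\rho) &= -\big\langle I + \log\rho, \operatorname{div}((\Gamma(\rho),\rho)\#\nabla U)\big\rangle_{L^2(\tau)} \\
&= \big\langle \nabla\log\rho, (\Gamma(\rho),\rho)\#\nabla U\big\rangle_{L^2(\tau)} \\
&= \big\langle (\Gamma(\rho),\rho)\#\nabla\log\rho, \nabla U\big\rangle_{L^2(\tau)} \\
&= \langle \nabla\rho, \nabla U\rangle_{L^2(\tau)} = \langle \mathcal{N}\rho, U\rangle_{L^2(\tau)} .
\end{aligned}$$

Therefore, using the geodesic equations (56),

$$\begin{aligned}
\frac{d^2}{dt^2}S(\rho) &= -\big\langle \operatorname{div}((\Gamma(\rho),\rho)\#\nabla U), \mathcal{N}U\big\rangle_{L^2(\tau)} + \langle \mathcal{N}\rho, \partial_t U\rangle_{L^2(\tau)} \\
&= \big\langle (\Gamma(\rho),\rho)\#\nabla U, \nabla\mathcal{N}U\big\rangle_{L^2(\tau)} - \frac12\big\langle \mathcal{N}\rho, \rho\flat(\nabla U, \nabla U)\big\rangle_{L^2(\tau)} .
\end{aligned}$$
□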

Remark 5.6. The expression (58) is analogous to the one for the Hessian of the Boltzmann-Shannon entropy
$H(\rho) = \int_{\mathbb{R}^n}\rho(x)\log\rho(x)\,dx$ in the Wasserstein space over $\mathbb{R}^n$. In that case,

$$\operatorname{Hess}_\rho H(\nabla U, \nabla U) = \int_{\mathbb{R}^n}\Big(\rho\,\nabla U \cdot \nabla(-\Delta)U - \frac12(-\Delta\rho)|\nabla U|^2\Big)\,dx . \qquad (59)$$

Note that $-\Delta$, like $\mathcal{N}$, is a positive operator, which is why we have written (59) in terms of $-\Delta$. In this classical
setting, one may simplify (59) using the identity

$$\frac12\Delta|\nabla U|^2 = \nabla U \cdot \nabla\Delta U + \|\operatorname{Hess}(U)\|^2 ,$$

where $\|\operatorname{Hess}(U)\|^2$ denotes the sum of the squares of the entries of the Hessian of $U$. Thus, (59) reduces to

$$\operatorname{Hess}_\rho H(\nabla U, \nabla U) = \int_{\mathbb{R}^n}\|\operatorname{Hess}(U)\|^2\,\rho\,dx ,$$

which manifestly displays the positivity of $\operatorname{Hess}_\rho H$, and hence the geodesic convexity of the entropy $H$. We lack a
simple analog of

$$\mathcal{N}\big(\rho\flat(\nabla U, \nabla U)\big) ,$$

and thus we lack a simple means to show that the Hessian of $S$ is positive in $\mathfrak{C}$. In the final section of the paper,
we shall show that in fact it is strongly positive in that one even has, for $n = 1, 2$,

$$\operatorname{Hess}_\rho S(\nabla U, \nabla U) \ge \|\nabla U\|_\rho^2 .$$

We conjecture that this is true for all $n$. This conjecture is supported by the close connection between Logarithmic
Sobolev Inequalities and entropy, and because the Logarithmic Sobolev Inequalities would be a classical consequence
if this convexity is true.

Remark 5.7. In addition to the conjecture made in the previous remark, there are many open problems. In the
classical case, gradient flows of all sorts of information theoretic functionals of densities lead to physically interesting
evolution equations. Whether this is the case in the quantum setting remains to be seen.

Another open problem concerns the curvature of $\mathfrak{P}$ in our metric. As Otto has shown, the 2-Wasserstein metric
on the "manifold" of probability measures has non-negative sectional curvature, which has significant consequences
for the general study of gradient flows in the 2-Wasserstein metric. At present we lack any information on the
sectional curvature in $\mathfrak{P}$.

Our next aim is to prove Proposition 5.11, which asserts that non-negativity of the Hessian implies that the
entropy is convex along geodesics in the metric space $(\mathfrak{P}, d)$. Since the Riemannian metric degenerates at the
boundary of $\mathfrak{P}$, this implication is not obvious. In order to prove this result, we adapt the Eulerian approach from
d^ m to our setting.

To carry out the calculations efficiently, we compress our notation at this point. For $\rho \in \mathfrak{P}_+$, define

$$\widehat\rho := \int_0^1 \Gamma(\rho)^{1-\alpha} \otimes \rho^{\alpha}\,d\alpha \in \mathfrak{C} \otimes \mathfrak{C} .$$

With this notation we can write

$$(\Gamma(\rho),\rho)\#A = \widehat\rho * A , \qquad (60)$$

where $*$ denotes the contraction operation

$$(A \otimes B) * C := ACB .$$

Given a curve $t \mapsto \rho(t) \in \mathfrak{P}_+$ it will be useful to calculate $\frac{d}{dt}\widehat\rho(t)$.
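Lemma 5.8. Let $t \mapsto \rho(t) \in \mathfrak{P}_+$ be a smooth curve. Then

$$\begin{aligned}
\frac{d}{dt}\widehat\rho(t) = \int_0^1\!\!\int_0^1\!\!\int_0^{\alpha} \Big[\,
&\Gamma(\rho(t))^{1-\alpha} \otimes \frac{\rho(t)^{\alpha-\beta}}{(1-s)I + s\rho(t)}\,\dot\rho(t)\,\frac{\rho(t)^{\beta}}{(1-s)I + s\rho(t)} \\
&+ \frac{\Gamma(\rho(t))^{\alpha-\beta}}{(1-s)I + s\Gamma(\rho(t))}\,\Gamma(\dot\rho(t))\,\frac{\Gamma(\rho(t))^{\beta}}{(1-s)I + s\Gamma(\rho(t))} \otimes \rho(t)^{1-\alpha}\Big]\,d\beta\,d\alpha\,ds .
\end{aligned}$$

Proof. By the product rule, we have

$$\frac{d}{dt}\widehat\rho(t) = \int_0^1 \Big[\Gamma(\rho(t))^{1-\alpha} \otimes \frac{d}{dt}\rho(t)^{\alpha} + \frac{d}{dt}\big(\Gamma(\rho(t))^{1-\alpha}\big) \otimes \rho(t)^{\alpha}\Big]\,d\alpha .$$

Therefore the result follows from the fact that

$$\frac{d}{dt}\rho(t)^{\alpha} = \int_0^1\!\!\int_0^{\alpha} \frac{\rho(t)^{\alpha-\beta}}{(1-s)I + s\rho(t)}\,\dot\rho(t)\,\frac{\rho(t)^{\beta}}{(1-s)I + s\rho(t)}\,d\beta\,ds ,$$

which is a consequence of Propositions A.1 and | □

This leads to the following definition.

Definition 5.9. For $\rho \in \mathfrak{P}_+$ we define $\mathcal{N}(\rho) \in \mathfrak{C} \otimes \mathfrak{C}$ by

$$\begin{aligned}
\mathcal{N}(\rho) = \int_0^1\!\!\int_0^1\!\!\int_0^{\alpha} \Big[\,
&\Gamma(\rho)^{1-\alpha} \otimes \frac{\rho^{\alpha-\beta}}{(1-s)I + s\rho}\,\mathcal{N}\rho\,\frac{\rho^{\beta}}{(1-s)I + s\rho} \\
&+ \frac{\Gamma(\rho)^{\alpha-\beta}}{(1-s)I + s\Gamma(\rho)}\,\Gamma(\mathcal{N}\rho)\,\frac{\Gamma(\rho)^{\beta}}{(1-s)I + s\Gamma(\rho)} \otimes \rho^{1-\alpha}\Big]\,d\beta\,d\alpha\,ds .
\end{aligned}$$

Then we have the following result.

Lemma 5.10. If $\rho(t) = \mathcal{P}_t\rho$, then

$$\frac{d}{dt}\widehat\rho(t) = -\mathcal{N}(\rho(t)) .$$

Proof. This is an immediate consequence of Lemma 5.8 and Definition 5.9. □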



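Now we are ready to state the announced result. Since parts of the argument are very similar to pT| , we shall
only give a sketch of the proof.

Proposition 5.11. Let $\kappa \in \mathbb{R}$. If $\operatorname{Hess}_\rho S(\nabla U, \nabla U) \ge \kappa\langle \nabla U, \nabla U\rangle_\rho$ for all $\rho \in \mathfrak{P}_+$, then for all constant speed
geodesics $u : [0,1] \to \mathfrak{P}$ we have

$$S(u(t)) \le (1-t)S(u(0)) + tS(u(1)) - \frac{\kappa}{2}\,t(1-t)\,d(u(0), u(1))^2 .$$

Proof. For $\rho \in \mathfrak{P}_+$ and $U \in \mathfrak{C}_0$ we set

$$\begin{aligned}
A(\rho, U) &= \|\nabla U\|_\rho^2 = \langle \widehat\rho * \nabla U, \nabla U\rangle_{L^2(\tau)} , \\
B(\rho, U) &= \operatorname{Hess}_\rho S(\nabla U, \nabla U) = \langle \widehat\rho * \nabla U, \nabla\mathcal{N}U\rangle_{L^2(\tau)} - \frac12\langle \mathcal{N}(\rho) * \nabla U, \nabla U\rangle_{L^2(\tau)} .
\end{aligned}$$

Let $\{\rho^s\}_{s\in[0,1]}$ be a smooth curve in $\mathfrak{P}_+$ and set $\rho_t^s := \mathcal{P}_{st}\rho^s$ for $t \ge 0$. Let $\{U_t^s\}_{s\in[0,1]}$ be a smooth curve in $\mathfrak{C}_0$
satisfying the continuity equation

$$\partial_s\rho_t^s + \operatorname{div}(\widehat\rho_t^s * \nabla U_t^s) = 0 , \qquad s \in [0,1] .$$

We claim that the identity

$$\frac12\partial_t A(\rho_t^s, U_t^s) + \partial_s S(\rho_t^s) = -s\,B(\rho_t^s, U_t^s)$$

holds for every $s \in [0,1]$ and $t \ge 0$. Once this is proved, the result follows from the argument in [11, Section 3] (see
also Esl Theorem 4.4] where this program has been carried out in a discrete setting).

To prove the claim, we calculate

$$\begin{aligned}
\partial_s S(\rho_t^s) &= \langle I + \log\rho_t^s, \partial_s\rho_t^s\rangle_{L^2(\tau)} \\
&= -\langle I + \log\rho_t^s, \operatorname{div}(\widehat\rho_t^s * \nabla U_t^s)\rangle_{L^2(\tau)} \\
&= \langle \nabla\log\rho_t^s, \widehat\rho_t^s * \nabla U_t^s\rangle_{L^2(\tau)} . \qquad (61)
\end{aligned}$$

Furthermore,

$$\frac12\partial_t A(\rho_t^s, U_t^s) = \langle \widehat\rho_t^s * \partial_t\nabla U_t^s, \nabla U_t^s\rangle_{L^2(\tau)} + \frac12\langle (\partial_t\widehat\rho_t^s) * \nabla U_t^s, \nabla U_t^s\rangle_{L^2(\tau)} =: I_1 + I_2 .$$

In order to simplify $I_1$ we claim that

$$-\operatorname{div}\big((\partial_t\widehat\rho_t^s) * \nabla U_t^s\big) - \operatorname{div}\big(\widehat\rho_t^s * \partial_t\nabla U_t^s\big) = s\,\mathcal{N}\big(\operatorname{div}(\widehat\rho_t^s * \nabla U_t^s)\big) - \mathcal{N}\rho_t^s , \qquad (62)$$

$$\partial_t\widehat\rho_t^s = -s\,\mathcal{N}(\rho_t^s) . \qquad (63)$$

To show (62), note that the left-hand side equals $\partial_t\partial_s\rho_t^s$, while the right-hand side equals $\partial_s\partial_t\rho_t^s$. The identity (63)
follows from Lemma 5.10.

Integrating by parts repeatedly and using (61), (62) and (63), we obtain

$$\begin{aligned}
I_1 &= -\langle U_t^s, \operatorname{div}(\widehat\rho_t^s * \partial_t\nabla U_t^s)\rangle_{L^2(\tau)} \\
&= \langle U_t^s, \operatorname{div}((\partial_t\widehat\rho_t^s) * \nabla U_t^s)\rangle_{L^2(\tau)} + \langle U_t^s, s\,\mathcal{N}\operatorname{div}(\widehat\rho_t^s * \nabla U_t^s) - \mathcal{N}\rho_t^s\rangle_{L^2(\tau)} \\
&= -\partial_s S(\rho_t^s) - s\langle \widehat\rho_t^s * \nabla U_t^s, \nabla\mathcal{N}U_t^s\rangle_{L^2(\tau)} + s\langle \mathcal{N}(\rho_t^s) * \nabla U_t^s, \nabla U_t^s\rangle_{L^2(\tau)} .
\end{aligned}$$

Taking into account that, by (63),

$$I_2 = -\frac{s}{2}\langle \mathcal{N}(\rho_t^s) * \nabla U_t^s, \nabla U_t^s\rangle_{L^2(\tau)} ,$$

the result follows by summing the expressions for $I_1$ and $I_2$. □

6 Direct verification of the 1-convexity of the entropy

Our results in this section support the conjecture made in Remark 5.6. We shall show that for $n = 1, 2$, the entropy
is 1-convex along geodesics in the metric space $(\mathfrak{P}, d)$. This notion of convexity may be seen as a Fermionic analog
of McCann's displacement convexity [Q, which corresponds to convexity along geodesics in the 2-Wasserstein space
of probability measures.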

6.1 The 1-dimensional case 

In this section we shall perform some explicit computations in the Riemannian manifold $\mathfrak{P}_+$ in the special case
where the Clifford algebra is 1-dimensional.

In this case the Clifford algebra is commutative and consists of all elements of the form $X = xI + yQ$ with
$x, y \in \mathbb{C}$ and $Q = Q_1$. The set of probability densities is given by

$$\mathfrak{P} = \{ \rho_y := I + yQ : -1 \le y \le 1 \} ,$$

and $\rho_y$ belongs to $\mathfrak{P}_+$ if and only if $-1 < y < 1$. Our aim is to calculate the distance $d(\rho_{y_0}, \rho_{y_1})$ explicitly. For this
purpose, we observe that for $p > 0$,

$$\rho_y^p = \frac{(1+y)^p + (1-y)^p}{2}\,I + \frac{(1+y)^p - (1-y)^p}{2}\,Q =: c_p(y)I + d_p(y)Q .$$

Note also that $(\Gamma(\rho_y))^p = c_p(-y)I + d_p(-y)Q$. Therefore, if $U = u_0 I + uQ$ and $\mathbf{V} = \nabla U = uI$, then

$$\begin{aligned}
(\Gamma(\rho_y), \rho_y)\#\mathbf{V} &= \int_0^1 \big(c_{1-p}(-y)I + d_{1-p}(-y)Q\big)\big(c_p(y)I + d_p(y)Q\big)\,dp \cdot uI \\
&= \int_0^1 (1-y)^{1-p}(1+y)^{p}\,dp \cdot uI \\
&= \frac{y}{\operatorname{arctanh}(y)}\,uI .
\end{aligned}$$

We infer that

$$\operatorname{div}\big((\Gamma(\rho_y), \rho_y)\#\mathbf{V}\big) = -\frac{y}{\operatorname{arctanh}(y)}\,uQ ,$$

hence, if $\rho(t) = \rho_{y(t)}$ and $\nabla U(t) = u(t)I$, then the continuity equation

$$\frac{d}{dt}\rho(t) + \operatorname{div}\big((\Gamma(\rho(t)),\rho(t))\#\nabla U(t)\big) = 0$$

is equivalent to

$$y'(t) - \frac{y(t)}{\operatorname{arctanh}(y(t))}\,u(t) = 0 . \qquad (64)$$

Furthermore, since

$$\|\nabla U(t)\|_{\rho(t)}^2 = -\big\langle U(t), \operatorname{div}\big((\Gamma(\rho(t)),\rho(t))\#\nabla U(t)\big)\big\rangle_{L^2(\tau)} = \frac{y(t)}{\operatorname{arctanh}(y(t))}\,u^2(t) ,$$

we obtain for $y_0, y_1 \in (-1,1)$,

$$d(\rho_{y_0}, \rho_{y_1})^2 = \inf\Big\{ \int_0^1 \frac{y(t)}{\operatorname{arctanh}(y(t))}\,u^2(t)\,dt \Big\} , \qquad (65)$$

where the infimum runs over all smooth functions $y : [0,1] \to (-1,1)$ and $u : [0,1] \to \mathbb{R}$ satisfying (64) with
boundary conditions $y(0) = y_0$ and $y(1) = y_1$.

This metric coincides with the Riemannian metric studied in [20, Section 2] in the special case of a Markov
chain $K$ on a two-point space $\mathcal{X} = \{a, b\}$ with transition probabilities $K(a,a) = K(a,b) = K(b,a) = K(b,b) = \tfrac12$.
The minimization problem in (65) can be solved explicitly (see [20, Theorem 2.4]), and for $-1 \le y_0 < y_1 \le 1$ one
obtains

$$d(\rho_{y_0}, \rho_{y_1}) = \int_{y_0}^{y_1} \sqrt{\frac{\operatorname{arctanh}(y)}{y}}\,dy . \qquad (66)$$

Note that the function $y \mapsto \sqrt{\operatorname{arctanh}(y)/y}$ diverges as $y \to \pm 1$; this corresponds to the fact that the Riemannian
metric degenerates at the boundary of $\mathfrak{P}_+$. However, the improper integral in (66) does converge if $y_0 = -1$ or
$y_1 = 1$, which can be seen directly and can also be inferred from Theorem 4.3 and Proposition 4.5.
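
The one-dimensional formulas lend themselves to a quick numerical experiment. The sketch below assumes that (66) reads $d(\rho_{y_0},\rho_{y_1}) = \int_{y_0}^{y_1}\sqrt{\operatorname{arctanh}(y)/y}\,dy$ and uses the entropy $S(\rho_y) = \tfrac12[(1+y)\log(1+y) + (1-y)\log(1-y)]$, obtained from the eigenvalues $1 \pm y$ of $\rho_y$ under the normalized trace. It compares the distance to $I = \rho_0$ with the Talagrand bound $\sqrt{2S(\rho_y)}$ of Theorem 4.3 and with the cruder bound $2\|I - \rho_y\|_{L^2(\tau)} = 2|y|$ from Section 4.2. Function names and the quadrature rule are illustrative choices.

```python
import math


def entropy(y: float) -> float:
    """S(rho_y) = tau[rho_y log rho_y] for rho_y = I + yQ with |y| < 1."""
    return 0.5 * ((1.0 + y) * math.log1p(y) + (1.0 - y) * math.log1p(-y))


def speed(y: float) -> float:
    """Integrand sqrt(arctanh(y) / y); its limit at y = 0 is 1."""
    return 1.0 if abs(y) < 1e-12 else math.sqrt(math.atanh(y) / y)


def distance(y0: float, y1: float, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of `speed` between y0 and y1."""
    h = (y1 - y0) / n
    return abs(h) * sum(speed(y0 + (k + 0.5) * h) for k in range(n))


if __name__ == "__main__":
    for y in (0.1, 0.5, 0.9):
        d = distance(0.0, y)
        print(y, d, math.sqrt(2.0 * entropy(y)), 2.0 * abs(y))
```

In this experiment the entropy bound stays close to the computed distance, while $2|y|$ is noticeably larger.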

Let $-1 < y_0 < y_1 < 1$. It has been shown in [20, Proposition 2.7] that the geodesic equation for a curve
$[0,1] \ni t \mapsto \rho_{y(t)} \in \mathfrak{P}_+$ connecting $\rho_{y_0}$ and $\rho_{y_1}$ is given by

$$y'(t) = d(\rho_{y_0}, \rho_{y_1})\sqrt{\frac{y(t)}{\operatorname{arctanh}(y(t))}} . \qquad (67)$$

Moreover, if $y(t)$ satisfies (67), then the second derivative of the entropy is given by

$$\frac{d^2}{dt^2}S(\rho_{y(t)}) = \frac{d(\rho_{y_0}, \rho_{y_1})^2}{2}\Big(1 + \frac{1}{1 - y(t)^2}\,\frac{y(t)}{\operatorname{arctanh}(y(t))}\Big) ,$$

which implies that

$$S(\rho_{y(t)}) \le (1-t)S(\rho_{y_0}) + tS(\rho_{y_1}) - \frac12\,t(1-t)\,d(\rho_{y_0}, \rho_{y_1})^2 ,$$

thus $S$ is 1-convex along geodesics. We refer to [20, Section 2] for more details.

6.2 The 2-dimensional case 

As in the 1-dimensional case, our goal is to obtain an explicit formula for the Hessian of the entropy $S$ and to show
that it is bounded from below. First we shall describe the set of probability densities. For this purpose, it will be
useful to introduce the notation

$$\rho_{\mathbf{r}} = I + xQ_1 + yQ_2 + izQ_1Q_2$$

for $\mathbf{r} = (x, y, z) \in \mathbb{C}^3$.

With this notation, the set of probability densities can be characterized as follows.

Lemma 6.1. We have

$$\mathfrak{P} = \{ \rho_{\mathbf{r}} \in \mathfrak{C} : \mathbf{r} = (x, y, z) \in \overline{B} \} ,$$

where $\overline{B}$ denotes the closure of the unit ball in $\mathbb{R}^3$. Moreover, $\rho_{\mathbf{r}}$ belongs to $\mathfrak{P}_+$ if and only if $\mathbf{r}$ belongs to the open
unit ball $B$.

Proof. Let $X \in \mathfrak{C}$ be of the form

$$X = w + xQ_1 + yQ_2 + izQ_1Q_2$$

for some $w, x, y, z \in \mathbb{C}$. Clearly, $X$ is self-adjoint if and only if $w, x, y, z \in \mathbb{R}$. In this case, one readily checks that
the spectrum of $X$ consists of the two elements

$$w \pm \sqrt{x^2 + y^2 + z^2} ,$$

both of which have multiplicity 2. This implies both assertions, taking into account that $\tau(X) = w$. □

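In order to obtain explicit formulas for expressions of the form $(\Gamma(\rho),\rho)\#\nabla U$ with $\rho \in \mathfrak{P}_+$ and $U \in \mathfrak{C}_0$, one
needs to evaluate fractional powers of $\rho$. The following result describes the functional calculus of elements in $\mathfrak{P}$.

Lemma 6.2. For $\mathbf{r} \in \overline{B} \setminus \{0\}$ and $f : [0,2] \to \mathbb{R}$ we have

$$f(\rho_{\mathbf{r}}) = \frac{f(1 - |\mathbf{r}|)}{2}\,\rho_{-\mathbf{n}} + \frac{f(1 + |\mathbf{r}|)}{2}\,\rho_{\mathbf{n}} ,$$

where $\mathbf{n} = \frac{1}{|\mathbf{r}|}\mathbf{r}$.

Proof. One easily checks that an element

$$X = w + xQ_1 + yQ_2 + izQ_1Q_2$$

is a projection if and only if $X = \tfrac12\rho_{\mathbf{r}}$ for some $\mathbf{r} \in \partial B$, where $\partial B$ denotes the unit sphere in $\mathbb{R}^3$. Furthermore, two
projections $X^{(1)} = \tfrac12\rho_{\mathbf{r}^{(1)}}$ and $X^{(2)} = \tfrac12\rho_{\mathbf{r}^{(2)}}$ are mutually orthogonal if and only if $\mathbf{r}^{(1)} = -\mathbf{r}^{(2)}$. As a consequence,
the spectral decomposition of $\rho_{\mathbf{r}}$ with $\mathbf{r} \in \overline{B}$ is given by

$$\rho_{\mathbf{r}} = (1 - |\mathbf{r}|)P_{(-)} + (1 + |\mathbf{r}|)P_{(+)} ,$$

where $P_{(\pm)} = \tfrac12\rho_{\pm\mathbf{n}}$ and $\mathbf{n} = \frac{1}{|\mathbf{r}|}\mathbf{r}$. This implies the desired result. □

In the following computations, an important role will be played by the logarithmic mean $\mu(x, y)$, which is defined
for $x, y > 0$ by

$$\mu(x, y) = \int_0^1 x^{1-\alpha}y^{\alpha}\,d\alpha .$$

Let us fix the notation that shall be used throughout the remainder of this section. We consider a fixed element
$\rho \in \mathfrak{P}_+$ of the form

$$\rho = I + xQ_1 + yQ_2 + izQ_1Q_2$$

for some $x, y, z \in \mathbb{R}$ satisfying

$$r := \sqrt{x^2 + y^2 + z^2} \in (0, 1) .$$

It will be useful to introduce the quantity

$$\theta := \mu(1 - r, 1 + r) = \frac{r}{\operatorname{arctanh}(r)} .$$

Furthermore, we set $a := x/r$, $b := y/r$, $c := z/r$, and

$$\mathbf{m} = (-a, -b, c) , \qquad \mathbf{n} = (a, b, c) .$$

Lemma 6.3. Let $\rho \in \mathfrak{P}_+$ and $U \in \mathfrak{C}$. With the notation from above we have

$$(\Gamma(\rho),\rho)\#U = \frac14\sum_{\varepsilon_1, \varepsilon_2 \in \{-1, 1\}} \mu(1 + \varepsilon_1 r, 1 + \varepsilon_2 r)\,\rho_{\varepsilon_1\mathbf{m}}\,U\,\rho_{\varepsilon_2\mathbf{n}} .$$

Proof. This readily follows from Lemma 6.2. □

With the help of this lemma, it is straightforward to obtain the following identities.

Lemma 6.4. For $\rho \in \mathfrak{P}_+$ the following identities hold:

$$\begin{aligned}
(\Gamma(\rho),\rho)\#I &= \big(\theta(a^2 + b^2) + c^2\big)I + ibc(1 - \theta)Q_1 - iac(1 - \theta)Q_2 + icr\,Q_1Q_2 , \\
(\Gamma(\rho),\rho)\#(iQ_1) &= bc(1 - \theta)I + i\big(\theta(a^2 + c^2) + b^2\big)Q_1 - iab(1 - \theta)Q_2 + ibr\,Q_1Q_2 , \\
(\Gamma(\rho),\rho)\#(iQ_2) &= -ac(1 - \theta)I - iab(1 - \theta)Q_1 + i\big(\theta(b^2 + c^2) + a^2\big)Q_2 - iar\,Q_1Q_2 .
\end{aligned}$$

Proof. This follows from a direct computation based on Lemma 6.3. □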

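Using this lemma we can obtain an explicit expression for the Riemannian metric. With the notation from
above we obtain the following result.

Lemma 6.5. Let $\rho \in \mathfrak{P}_+$ and let $U \in \mathfrak{C}$ be of the form $U = uQ_1 + vQ_2 + iwQ_1Q_2$ for some $u, v, w \in \mathbb{R}$. Then

$$\langle \nabla U, \nabla U\rangle_\rho = \mathbf{u}^T M(\rho)\,\mathbf{u} ,$$

where the right-hand side is a matrix-product with $\mathbf{u}^T = (u, v, w)$ and

$$M(\rho) = \begin{pmatrix}
\theta(a^2 + b^2) + c^2 & 0 & (\theta - 1)ac \\
0 & \theta(a^2 + b^2) + c^2 & (\theta - 1)bc \\
(\theta - 1)ac & (\theta - 1)bc & a^2 + b^2 + \theta(1 + c^2)
\end{pmatrix} .$$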
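
The matrix in Lemma 6.5 can be cross-checked numerically. The sketch below rests on several assumptions that are not spelled out in this section and are labeled as such: the generators are realized by the Pauli matrices $Q_1 = \sigma_1$, $Q_2 = \sigma_2$; the grading is $\Gamma(A) = \sigma_3 A \sigma_3$; the gradient is $\nabla_j A = \tfrac12(Q_j A - \Gamma(A)Q_j)$; $\tau$ is the normalized trace; and $(\Gamma(\rho),\rho)\#A = \int_0^1 \Gamma(\rho)^{1-\alpha}A\rho^{\alpha}\,d\alpha$, evaluated by Simpson's rule with powers taken from the functional calculus of Lemma 6.2. All helper names are hypothetical.

```python
import math


def mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def lin(*terms):
    """Linear combination: lin((c1, A1), (c2, A2), ...) returns sum_k c_k * A_k."""
    return [[sum(c * A[i][j] for c, A in terms) for j in range(2)] for i in range(2)]


def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]


def tau(A):
    """Normalized trace."""
    return (A[0][0] + A[1][1]) / 2


ID = [[1, 0], [0, 1]]
Q1 = [[0, 1], [1, 0]]            # assumed representation: sigma_1
Q2 = [[0, -1j], [1j, 0]]         # assumed representation: sigma_2
Q12 = mul(Q1, Q2)                # Q1 Q2 = i * sigma_3
S3 = lin((-1j, Q12))             # sigma_3, implements the grading by conjugation


def grading(A):
    """Gamma(A) = sigma_3 A sigma_3: flips Q1 and Q2, fixes I and Q1 Q2."""
    return mul(S3, mul(A, S3))


def rho(x, y, z):
    """rho_r = I + x Q1 + y Q2 + i z Q1 Q2."""
    return lin((1, ID), (x, Q1), (y, Q2), (1j * z, Q12))


def power(x, y, z, p):
    """rho_r^p via the spectral formula of Lemma 6.2 (requires 0 < |r| < 1)."""
    r = math.sqrt(x * x + y * y + z * z)
    n = (x / r, y / r, z / r)
    return lin(((1 - r) ** p / 2, rho(-n[0], -n[1], -n[2])), ((1 + r) ** p / 2, rho(*n)))


def sharp(x, y, z, A, steps=200):
    """Simpson approximation of int_0^1 Gamma(rho)^(1-a) A rho^a da."""
    h = 1.0 / steps
    terms = []
    for k in range(steps + 1):
        a = k * h
        w = h / 3 * (1 if k in (0, steps) else 4 if k % 2 else 2)
        # Gamma(rho_r) = rho_(-x, -y, z), so its powers come from the same formula.
        terms.append((w, mul(power(-x, -y, z, 1 - a), mul(A, power(x, y, z, a)))))
    return lin(*terms)


def grad(U):
    """Assumed gradient: nabla_j U = (Q_j U - Gamma(U) Q_j) / 2 for j = 1, 2."""
    return [lin((0.5, mul(Q, U)), (-0.5, mul(grading(U), Q))) for Q in (Q1, Q2)]


def metric_direct(x, y, z, u, v, w):
    """<grad U, grad U>_rho computed from the definitions."""
    U = lin((u, Q1), (v, Q2), (1j * w, Q12))
    return sum(tau(mul(dagger(G), sharp(x, y, z, G))) for G in grad(U)).real


def metric_lemma(x, y, z, u, v, w):
    """u^T M(rho) u with M(rho) as stated in Lemma 6.5."""
    r = math.sqrt(x * x + y * y + z * z)
    a, b, c, th = x / r, y / r, z / r, r / math.atanh(r)
    d = th * (a * a + b * b) + c * c
    M = [[d, 0.0, (th - 1) * a * c],
         [0.0, d, (th - 1) * b * c],
         [(th - 1) * a * c, (th - 1) * b * c, a * a + b * b + th * (1 + c * c)]]
    vec = (u, v, w)
    return sum(vec[i] * M[i][j] * vec[j] for i in range(3) for j in range(3))


if __name__ == "__main__":
    args = (0.3, -0.2, 0.4, 1.0, -0.5, 0.7)
    print(metric_direct(*args), metric_lemma(*args))
```

Under these assumptions the two numbers agree up to quadrature error, which gives an independent check on the entries of $M(\rho)$.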

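By a similar calculation one can compute the first term appearing in the expression (58) for the Hessian of the
entropy $S$ at $\rho$.

Lemma 6.6. Let $\rho \in \mathfrak{P}_+$ be as in Lemma 6.5 and let $U \in \mathfrak{C}$ be of the form $U = uQ_1 + vQ_2 + iwQ_1Q_2$ for some
$u, v, w \in \mathbb{R}$. Then,

$$\big\langle (\Gamma(\rho),\rho)\#\nabla U, \nabla\mathcal{N}U\big\rangle_{L^2(\tau)} = \mathbf{u}^T N_1(\rho)\,\mathbf{u} ,$$

where the right-hand side is a matrix-product with $\mathbf{u}^T = (u, v, w)$ and

$$N_1(\rho) = \begin{pmatrix}
\theta(a^2 + b^2) + c^2 & 0 & \tfrac32(\theta - 1)ac \\
0 & \theta(a^2 + b^2) + c^2 & \tfrac32(\theta - 1)bc \\
\tfrac32(\theta - 1)ac & \tfrac32(\theta - 1)bc & 2\big(a^2 + b^2 + \theta(1 + c^2)\big)
\end{pmatrix} .$$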

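With some additional work the second part in the expression (58) for the Hessian can be characterized as well.
It turns out that the following generalization of the logarithmic mean plays a role. For $x, y, z > 0$ we set

$$\mu(x, y, z) = 2\int_0^1\!\!\int_0^{\alpha} x^{1-\alpha}\,y^{\alpha-\beta}\,z^{\beta}\,d\beta\,d\alpha .$$

The following result gives an explicit expression for $\rho\flat(\mathbf{V}_1, \mathbf{V}_2)$.

Lemma 6.7. For $\rho \in \mathfrak{P}_+$ and $\mathbf{V}_1, \mathbf{V}_2 \in \mathfrak{C}^2$ we have

$$\rho\flat(\mathbf{V}_1, \mathbf{V}_2) = \frac18\sum_{\varepsilon_1, \varepsilon_2, \varepsilon_3 \in \{-1, 1\}} \frac{\mu(1 + \varepsilon_1 r, 1 + \varepsilon_2 r, 1 + \varepsilon_3 r)}{\mu(1 + \varepsilon_2 r, 1 + \varepsilon_3 r)}\;\rho_{\varepsilon_2\mathbf{n}}\,\big(\mathbf{V}_1^* \cdot \rho_{\varepsilon_1\mathbf{m}}\,\mathbf{V}_2\big)\,\rho_{\varepsilon_3\mathbf{n}} .$$

Proof. This follows using Lemma 6.2 and the definition of $\rho\flat(\mathbf{V}_1, \mathbf{V}_2)$. □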



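The identity from the previous lemma allows us to obtain an explicit expression for the second term in the
Hessian of the entropy:

Lemma 6.8. Let $\rho \in \mathfrak{P}_+$ and let $U \in \mathfrak{C}$ be of the form $U = uQ_1 + vQ_2 + iwQ_1Q_2$ for some $u, v, w \in \mathbb{R}$. Furthermore,
we set $\xi = \mu(1 - r, 1 - r, 1 + r)$ and $\eta = \mu(1 - r, 1 + r, 1 + r)$, and we consider the quantities

$$\Gamma = \frac{r}{4}\,\frac{\eta - \xi}{\theta} , \qquad \Delta = \frac{r}{4}\Big(\frac{\xi}{1 - r} - \frac{\eta}{1 + r}\Big) .$$

Then,

$$-\frac12\big\langle \mathcal{N}\rho, \rho\flat(\nabla U, \nabla U)\big\rangle_{L^2(\tau)} = \mathbf{u}^T N_2(\rho)\,\mathbf{u} ,$$

where the right-hand side is a matrix-product with $\mathbf{u}^T = (u, v, w)$ and

$$N_2(\rho) = \begin{pmatrix}
A & 0 & aC \\
0 & A & bC \\
aC & bC & B
\end{pmatrix}$$

with

$$\begin{aligned}
A &= (1 - c^2)\big((1 + c^2)\Delta - 2c^2\Gamma\big) , \\
B &= (1 + c^2)^2\Delta + 2c^2(1 - c^2)\Gamma , \\
C &= c\big((1 + c^2)\Delta + (1 - 2c^2)\Gamma\big) .
\end{aligned}$$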
Now that we have obtained explicit formulas for the metric and the Hessian, we are ready to prove the following 
result. 

Theorem 6.9. For all p £ *P_|_ and all sclfadjoint elements U G € we have 

Hessp 5(VC/, VL/) > ||VC/||2 . 
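Proof. It follows directly from Lemmas 6.5, 6.6, and 6.8 that for $\rho$ and $U$ as in these lemmas,

\[ \operatorname{Hess}_\rho S(\nabla U, \nabla U) - \|\nabla U\|^2 = u^T P(\rho)\, u , \]

where

\[ P(\rho) := N_1(\rho) + N_2(\rho) - M(\rho) = \begin{pmatrix} \mathcal{A} & 0 & a\mathcal{C} \\ 0 & \mathcal{A} & b\mathcal{C} \\ a\mathcal{C} & b\mathcal{C} & \mathcal{B} \end{pmatrix} , \]

where

\begin{align*}
\mathcal{A} &= (1-c^2)\big((1+c^2)\Lambda - 2c^2\Gamma\big) , \\
\mathcal{B} &= (1-c^2) + \theta(1+c^2) + (1+c^2)^2\Lambda + 2c^2(1-c^2)\Gamma , \\
\mathcal{C} &= c\big(\tfrac{1}{2}(\theta-1) + (1+c^2)\Lambda + (1-2c^2)\Gamma\big) .
\end{align*}

An elementary computation shows that a matrix of this form is positive semidefinite if and only if $\mathcal{A} \ge 0$, $\mathcal{B} \ge 0$, and

\[ \mathcal{A}\mathcal{B} \ge \mathcal{C}^2(a^2+b^2) . \tag{68} \]

The proof of these inequalities relies on the following one-dimensional inequalities, which shall be proved in Proposition 6.10 below:

\[ 0 \le 2\Gamma \le \Lambda , \tag{69} \]

\[ (1-\theta)^2 \le 4\Lambda . \tag{70} \]

In fact, the non-negativity of $\mathcal{A}$ and $\mathcal{B}$ follows immediately from (69). In order to prove (68) we write

\[ \mathcal{A}\mathcal{B} - \mathcal{C}^2(a^2+b^2) = (1-c^2)\big(\widetilde{A} + \widetilde{B}\big) , \]

where

\begin{align*}
\widetilde{A} &= (1+c^2)^2\Lambda^2 - c^2\Gamma^2 - 2c^2(1+c^2)\Gamma\Lambda , \\
\widetilde{B} &= (1+c^2)(1+\theta)\Lambda - c^2(1+3\theta)\Gamma - \tfrac{1}{4}c^2(1-\theta)^2 .
\end{align*}

Using (69) we infer that

\begin{align*}
\widetilde{A} &= (c^4+c^2)\Lambda(\Lambda-2\Gamma) + \big(1+\tfrac{3}{4}c^2\big)\Lambda^2 + c^2\big(\tfrac{1}{4}\Lambda^2 - \Gamma^2\big) \\
&\ge 0 .
\end{align*}

Furthermore, taking into account that $0 \le \theta \le 1$, using (69) once more, and finally (70), we obtain

\begin{align*}
\widetilde{B} &= (1+\theta)\Lambda - \tfrac{1}{4}c^2(1-\theta)^2 + c^2\big((1+\theta)\Lambda - (1+3\theta)\Gamma\big) \\
&\ge \Lambda - \tfrac{1}{4}(1-\theta)^2 + c^2(1+\theta)(\Lambda-2\Gamma) \\
&\ge \Lambda - \tfrac{1}{4}(1-\theta)^2 \\
&\ge 0 ,
\end{align*}

which completes the proof. □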
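The algebra in the proof above can be spot-checked numerically. The sketch below is an illustration only, not part of the original argument. It treats $c$, $\theta$, $\Lambda$, $\Gamma$ as independent real parameters and assumes the normalization $a^2+b^2 = 1-c^2$, which is what the factor $(1-c^2)$ in the determinant identity suggests. The entries of $P(\rho)$ are taken as $\mathcal{A} = (1-c^2)((1+c^2)\Lambda - 2c^2\Gamma)$, $\mathcal{B} = (1-c^2)+\theta(1+c^2)+(1+c^2)^2\Lambda+2c^2(1-c^2)\Gamma$ and $\mathcal{C} = c(\tfrac12(\theta-1)+(1+c^2)\Lambda+(1-2c^2)\Gamma)$. All function names are hypothetical.

```python
import random


def p_entries(c, theta, lam, gam):
    """Entries (A, B, C) of the matrix P(rho) as read off from the proof."""
    a_ = (1 - c**2) * ((1 + c**2) * lam - 2 * c**2 * gam)
    b_ = ((1 - c**2) + theta * (1 + c**2)
          + (1 + c**2) ** 2 * lam + 2 * c**2 * (1 - c**2) * gam)
    c_ = c * (0.5 * (theta - 1) + (1 + c**2) * lam + (1 - 2 * c**2) * gam)
    return a_, b_, c_


def tilde_terms(c, theta, lam, gam):
    """The two summands A~ and B~ in the factorization of A*B - C^2 (a^2 + b^2)."""
    a_t = ((1 + c**2) ** 2 * lam**2 - c**2 * gam**2
           - 2 * c**2 * (1 + c**2) * gam * lam)
    b_t = ((1 + c**2) * (1 + theta) * lam - c**2 * (1 + 3 * theta) * gam
           - 0.25 * c**2 * (1 - theta) ** 2)
    return a_t, b_t


def determinant_gap(c, theta, lam, gam):
    """A*B - C^2 (1 - c^2) minus (1 - c^2)(A~ + B~); assumes a^2 + b^2 = 1 - c^2."""
    a_, b_, c_ = p_entries(c, theta, lam, gam)
    a_t, b_t = tilde_terms(c, theta, lam, gam)
    return a_ * b_ - c_**2 * (1 - c**2) - (1 - c**2) * (a_t + b_t)


def sample_admissible(rng):
    """Draw (c, theta, Lambda, Gamma) with |c| <= 1, 0 < theta <= 1, (69) and (70)."""
    c = rng.uniform(-1.0, 1.0)
    theta = rng.uniform(0.05, 1.0)
    lam = (1 - theta) ** 2 / 4 + rng.uniform(0.0, 1.0)  # (1 - theta)^2 <= 4 Lambda
    gam = rng.uniform(0.0, lam / 2)                      # 0 <= 2 Gamma <= Lambda
    return c, theta, lam, gam


if __name__ == "__main__":
    rng = random.Random(0)
    worst = max(abs(determinant_gap(*sample_admissible(rng))) for _ in range(1000))
    print("largest deviation in the determinant identity:", worst)
```

Sampling the parameters independently tests only the polynomial identity and the sign conditions that follow from (69) and (70). It does not test that $\theta$, $\Lambda$, $\Gamma$ arising from an actual density satisfy (69) and (70); that is the content of Proposition 6.10.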



The following one-dimensional inequalities were essential in the proof of Theorem 6.9. 

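Proposition 6.10. For $-1 < r < 1$ we set $\theta = \theta(1-r, 1+r)$ and

\[ \xi = \mu(1-r, 1-r, 1+r) , \qquad \eta = \mu(1-r, 1+r, 1+r) . \]

Then the quantities

\[ \Gamma = \frac{r}{4}\,\frac{\eta-\xi}{\theta} \qquad \text{and} \qquad \Lambda = \frac{r}{4}\Big(\frac{\xi}{1-r} - \frac{\eta}{1+r}\Big) \]

satisfy the following inequalities:

\[ 0 \le 2\Gamma \le \Lambda , \tag{71} \]

\[ (1-\theta)^2 \le 2\Lambda . \tag{72} \]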

Proof. The first inequality in (71) is clear from the monotonicity of $\mu$. It follows from the 1-homogeneity of $\mu$
that the second inequality in (71) can be reformulated as

\[ \Big(1 + \frac{2(1+r)}{\theta}\Big)\,\mu(1,1,c^{-1}) \le \Big(1 + \frac{2(1-r)}{\theta}\Big)\,\mu(1,1,c) , \tag{73} \]

where $c = \frac{1+r}{1-r}$. Using the identity

\[ \frac{\mu(1,1,c)}{\mu(1,1,c^{-1})} = \frac{\frac{\theta}{1-r} - 1}{1 - \frac{\theta}{1+r}} \]

it follows that (73) is equivalent to

\[ G \le \frac{\theta}{\sqrt{2-\theta}} , \tag{74} \]

where $G = \sqrt{1-r^2}$ is the geometric mean of $1-r$ and $1+r$. Since (74) is readily checked, we obtain (71).
In order to prove (72), we use the identity

\[ \Lambda = \frac{\theta}{2}\Big(\frac{\theta}{1-r^2} - 1\Big) . \]

Therefore the inequality (72) is equivalent to

\[ (1-\theta)^2 \le \theta\Big(\frac{\theta}{1-r^2} - 1\Big) . \]

In view of the geometric-logarithmic mean inequality $\sqrt{1-r^2} \le \theta$, it suffices to show that

\[ \theta(1-\theta)^2 \le \theta - 1 + r^2 . \]

By another application of this inequality, it even suffices to show that

\[ \theta(1-\theta)^2 \le \theta - \theta^2 , \]

which reduces to $\theta \le 1$. This inequality holds by the concavity of $\theta$, hence the proof is complete. □
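The inequalities of Proposition 6.10 can also be checked numerically on a grid of values of $r$. The following sketch is an illustration only. It assumes that $\theta$ is the logarithmic mean of $1-r$ and $1+r$, that $\mu(x,y,z) = 2\int_0^1\int_0^\alpha x^{1-\alpha}y^{\alpha-\beta}z^{\beta}\,d\beta\,d\alpha$, and that $\Gamma = \frac{r}{4}\frac{\eta-\xi}{\theta}$ and $\Lambda = \frac{r}{4}\big(\frac{\xi}{1-r}-\frac{\eta}{1+r}\big)$. All function names are hypothetical.

```python
import math


def log_mean(x, y):
    """Logarithmic mean theta(x, y) for x, y > 0."""
    if math.isclose(x, y, rel_tol=1e-12):
        return x
    return (x - y) / (math.log(x) - math.log(y))


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def mu(x, y, z):
    """mu(x, y, z) = 2 int_0^1 int_0^a x^(1-a) y^(a-b) z^b db da."""

    def inner(a):
        # Closed form of the inner integral over b.
        if math.isclose(y, z, rel_tol=1e-12):
            return a * y ** a
        return (z ** a - y ** a) / (math.log(z) - math.log(y))

    return 2.0 * simpson(lambda a: x ** (1 - a) * inner(a), 0.0, 1.0)


def quantities(r):
    """Return (theta, Gamma, Lambda) for a nonzero r in (-1, 1)."""
    theta = log_mean(1 - r, 1 + r)
    xi = mu(1 - r, 1 - r, 1 + r)
    eta = mu(1 - r, 1 + r, 1 + r)
    gam = r / 4 * (eta - xi) / theta
    lam = r / 4 * (xi / (1 - r) - eta / (1 + r))
    return theta, gam, lam


if __name__ == "__main__":
    for r in (-0.9, -0.5, 0.1, 0.3, 0.5, 0.7, 0.9):
        theta, gam, lam = quantities(r)
        # Columns: r, 2*Gamma, Lambda, (1 - theta)^2, 2*Lambda.
        print(r, 2 * gam, lam, (1 - theta) ** 2, 2 * lam)
```

For small $|r|$ the two sides of $2\Gamma \le \Lambda$ agree to leading order in $r$, so the printed values are close there; this is consistent with the inequality, not a failure of it.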

A Some identities from non-commutative calculus 

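Throughout this section we let $\mathcal{A}$ be the collection of $m \times m$-matrices with complex entries. The subset of self-adjoint
elements shall be denoted by $\mathcal{A}_h$, and we let $\mathcal{A}_+$ be the collection of strictly positive elements in $\mathcal{A}$.
For $x, y, z \in \mathcal{A}$ we consider the contraction operation $* : (\mathcal{A} \otimes \mathcal{A}) \times \mathcal{A} \to \mathcal{A}$ defined by

\[ (x \otimes y) * z := xzy , \tag{75} \]

and linear extension.

For a smooth function $f : (0,\infty) \to \mathbb{R}$ we define

\[ \partial f(\lambda,\mu) := \begin{cases} \dfrac{f(\lambda) - f(\mu)}{\lambda - \mu} , & \lambda \neq \mu , \\[1ex] f'(\lambda) , & \lambda = \mu . \end{cases} \]

Let $X, Y \in \mathcal{A}_+$ with spectral decompositions $X = \sum_{j=1}^m \lambda_j x_j$ and $Y = \sum_{k=1}^m \mu_k y_k$ for some $\lambda_j, \mu_k > 0$ and
projections $x_j, y_k$ with $\sum_{j=1}^m x_j = \sum_{k=1}^m y_k = I$. We define the non-commutative derivative of $f$ as

\[ \partial f(X,Y) := \sum_{j,k=1}^m \partial f(\lambda_j,\mu_k)\, x_j \otimes y_k . \]

The relevance of $\partial f(X,Y)$ is due to the fact that it allows us to formulate suitable versions of the chain rule in a
non-commutative setting.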
Proposition A.1. Let $f : (0,\infty) \to \mathbb{R}$ be a smooth function.

(1) (Discrete chain rule) For $X, Y \in \mathcal{A}_+$ we have

\[ f(X) - f(Y) = \partial f(X,Y) * (X - Y) . \tag{76} \]

(2) (Chain rule) For a smooth curve $t \mapsto X(t) \in \mathcal{A}_+$ we have

\[ \frac{d}{dt} f(X(t)) = \partial f(X(t),X(t)) * X'(t) . \tag{77} \]

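Proof. To prove (76), we write

\begin{align*}
f(X) - f(Y) &= \sum_{j=1}^m f(\lambda_j)\, x_j - \sum_{k=1}^m f(\mu_k)\, y_k \\
&= \sum_{j,k=1}^m \partial f(\lambda_j,\mu_k)(\lambda_j - \mu_k)\, x_j y_k \\
&= \sum_{j,k=1}^m \partial f(\lambda_j,\mu_k)\, x_j \otimes y_k * \Big( \sum_{l,p=1}^m (\lambda_l - \mu_p)\, x_l y_p \Big) \\
&= \partial f(X,Y) * (X - Y) ,
\end{align*}

where we used that $x_j x_l y_p y_k = \delta_{jl}\delta_{pk}\, x_j y_k$.

The identity (77) is obtained by passing to the limit in (76). □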
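The discrete chain rule (76) is easy to test on small matrices. The sketch below is an illustration only. It works with real symmetric $2 \times 2$ matrices built from prescribed eigenvalues and rotation angles, so that the spectral projections are known explicitly, and evaluates $\partial f(X,Y) * Z = \sum_{j,k} \partial f(\lambda_j,\mu_k)\, x_j Z y_k$ with $f = \log$. All helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b, sb=1.0):
    """Entrywise a + sb * b."""
    return [[a[i][j] + sb * b[i][j] for j in range(2)] for i in range(2)]


def scale(s, a):
    return [[s * a[i][j] for j in range(2)] for i in range(2)]


def projections(phi):
    """Rank-one projections onto the columns of the rotation by the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [[[c * c, c * s], [c * s, s * s]],
            [[s * s, -c * s], [-c * s, c * c]]]


def from_spectrum(eigs, projs, f=lambda t: t):
    """f(X) = sum_j f(lambda_j) x_j for X with eigenvalues eigs and projections projs."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for lam, p in zip(eigs, projs):
        out = add(out, scale(f(lam), p))
    return out


def divided_difference(f, fprime, lam, mu):
    """Scalar df(lam, mu): difference quotient, or f'(lam) on the diagonal."""
    if math.isclose(lam, mu, rel_tol=1e-12):
        return fprime(lam)
    return (f(lam) - f(mu)) / (lam - mu)


def nc_derivative_apply(f, fprime, eigs_x, projs_x, eigs_y, projs_y, z):
    """Evaluate df(X, Y) * Z = sum_{j,k} df(lambda_j, mu_k) x_j Z y_k."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for lam, xj in zip(eigs_x, projs_x):
        for mu, yk in zip(eigs_y, projs_y):
            term = matmul(matmul(xj, z), yk)
            out = add(out, scale(divided_difference(f, fprime, lam, mu), term))
    return out


def chain_rule_defect(eigs_x, phi_x, eigs_y, phi_y):
    """Largest entry of f(X) - f(Y) - df(X, Y) * (X - Y) for f = log."""
    px, py = projections(phi_x), projections(phi_y)
    x, y = from_spectrum(eigs_x, px), from_spectrum(eigs_y, py)
    lhs = add(from_spectrum(eigs_x, px, math.log),
              from_spectrum(eigs_y, py, math.log), -1.0)
    rhs = nc_derivative_apply(math.log, lambda t: 1.0 / t,
                              eigs_x, px, eigs_y, py, add(x, y, -1.0))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    print(chain_rule_defect([0.5, 2.0], 0.3, [1.5, 3.0], 1.1))
```

The two matrices in the example do not commute, since their eigenbases are rotated by different angles, so the check exercises the genuinely non-commutative case.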

It will be useful to compute the non-commutative derivatives of some frequently occurring functions. 
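Proposition A.2. For $A, B \in \mathcal{A}_+$ we have

\begin{align*}
\partial[t \mapsto t^n](A,B) &= \sum_{j=0}^{n-1} A^{n-j-1} \otimes B^{j} , \qquad n = 1, 2, \ldots , \\
\partial\exp(A,B) &= \int_0^1 e^{(1-s)A} \otimes e^{sB}\, ds , \\
\partial\log(A,B) &= \int_0^1 \big((1-s)I + sA\big)^{-1} \otimes \big((1-s)I + sB\big)^{-1}\, ds .
\end{align*}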

Proof. This follows from the following elementary identities, which hold for $\lambda, \mu > 0$:

\[ \partial[t \mapsto t^n](\lambda,\mu) = \sum_{j=0}^{n-1} \lambda^{n-j-1}\mu^{j} , \qquad n = 1, 2, \ldots , \]



^[t^t^]{X,^i)= / , ....^ ^—Apds, ae(0,l) 

Jo Jt) ((1 -s) + sA)((l-s) + s/i) 

\[ \partial\exp(\lambda,\mu) = \int_0^1 e^{(1-s)\lambda + s\mu}\, ds , \]



\[ \partial\log(\lambda,\mu) = \int_0^1 \frac{1}{\big((1-s) + s\lambda\big)\big((1-s) + s\mu\big)}\, ds . \]

□
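The scalar identities for $t^n$, $\exp$ and $\log$ used in this proof can be confirmed numerically by comparing each integral or sum with the divided difference $(f(\lambda)-f(\mu))/(\lambda-\mu)$. The sketch below is an illustration only; the integrals are approximated with a composite Simpson rule, and the function names are hypothetical.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def divided_difference(f, fprime, lam, mu):
    """df(lam, mu): difference quotient off the diagonal, derivative on it."""
    if math.isclose(lam, mu, rel_tol=1e-12):
        return fprime(lam)
    return (f(lam) - f(mu)) / (lam - mu)


def d_power(n, lam, mu):
    """Sum formula for the divided difference of t -> t^n."""
    return sum(lam ** (n - j - 1) * mu ** j for j in range(n))


def d_exp(lam, mu):
    """Integral formula for the divided difference of exp."""
    return simpson(lambda s: math.exp((1 - s) * lam + s * mu), 0.0, 1.0)


def d_log(lam, mu):
    """Integral formula for the divided difference of log."""
    return simpson(
        lambda s: 1.0 / (((1 - s) + s * lam) * ((1 - s) + s * mu)), 0.0, 1.0)


if __name__ == "__main__":
    lam, mu = 2.0, 0.5
    print(d_power(3, lam, mu),
          divided_difference(lambda t: t ** 3, lambda t: 3 * t ** 2, lam, mu))
    print(d_exp(lam, mu), divided_difference(math.exp, math.exp, lam, mu))
    print(d_log(lam, mu), divided_difference(math.log, lambda t: 1 / t, lam, mu))
```

The operator identities of Proposition A.2 then follow by inserting the spectral decompositions of $A$ and $B$ into these scalar formulas.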